Abstract



Imitating successful behavior is a natural and frequently applied approach when facing scenarios
for which we have little or no experience upon which we can base our decision. In this paper, we
consider such behavior in atomic congestion games. We propose to study concurrent imitation dynamics
that emerge when each player samples another player and possibly imitates this agent's strategy if the
anticipated latency gain is sufficiently large. Our main focus is on convergence properties. Using a potential
function argument, we show that our dynamics converge in a monotonic fashion to stable states. In such a 
state none of the players can improve its latency by imitating somebody else. 

As our main result, we show rapid convergence to approximate equilibria. At an approximate equilib- 
rium only a small fraction of agents sustains a latency significantly above or below average. In particular, 
imitation dynamics behave like fully polynomial time approximation schemes (FPTAS). Fixing all other 
parameters, the convergence time depends only in a logarithmic fashion on the number of agents. 

Since imitation processes are not innovative they cannot discover unused strategies. Furthermore, strate- 
gies may become extinct with non-zero probability. For the case of singleton games, we show that the 
probability of this event occurring is negligible. Additionally, we prove that the social cost of a stable state 
reached by our dynamics is not much worse than an optimal state in singleton congestion games with linear 
latency functions. Finally, we discuss how the protocol can be extended such that, in the long run, dynamics
converge to a Nash equilibrium. 

1 Introduction 

We study imitation dynamics that emerge if myopic players concurrently imitate each other in order to im- 
prove on their own situation. In scenarios for which players have little or no experience upon which they can 
base their decisions, or in which precise knowledge about the available options and their consequences is ab- 
sent, it is a good strategy to imitate successful behavior. Thus, it is not surprising that such imitating behavior 
can frequently be observed, and has already been studied intensively in economics and game theory [20, 27]. 

We analyze such imitation dynamics in the context of symmetric congestion games [24]. As an example 
of such a game consider a network congestion game in which players strive to allocate paths with minimum 
latency between the same source-sink pair in a network. The latency of a path equals the sum of the latencies 
of the edges in that path and the latency of an edge depends on the number of players sharing it.

* This work was in part supported by the DFG through German UMIC-excellence cluster at RWTH Aachen University.
† Supported by an NSERC grant. Part of this work was done while author visited RWTH Aachen University.
‡ Supported by the German Academic Exchange Service (DAAD) within the PostDoc-Program.
We consider a simple imitation rule according to which players strive to improve their individual latencies
over time by imitating others in a concurrent and round-based fashion. This IMITATION PROTOCOL has 
several appealing properties: it is simple, stateless, based on local information, and is compatible with the 
selfish incentives of the players. The IMITATION PROTOCOL consists of a sampling and a migration step. 
First, each player samples another player uniformly at random. Then he considers the latency gain that he 
would have by adopting the strategy of the sampled player, under the assumption that no-one else changes 
his strategy. If this latency gain is not too small our player adopts the sampled strategy with a migration 
probability mainly depending on the anticipated latency gain. The major technical challenge in designing 
such a concurrent protocol is to avoid overshooting effects. Overshooting occurs if too many players sample 
other players currently using the same strategy, and if all of them migrate towards it. In this case their latency 
might be greater than before the migration. In order to avoid overshooting, the migration probabilities have 
to be defined appropriately without sacrificing the benefit of concurrency. We propose to scale the migration 
probabilities by the elasticity of the latency functions in order to avoid overshooting. The elasticity of a 
function at point x describes the proportional growth of the function value as a result of a proportional growth 
of its argument. Note that in case of polynomial latency functions with positive coefficients and maximum 
degree d the elasticity is upper bounded by d. 

A natural solution concept in this scenario is imitation-stability. A state is imitation-stable if no more 
improvements are possible based on the IMITATION PROTOCOL. We analyse convergence properties with 
respect to this solution concept. 

1.1 Our Results 

As our first result we prove that the IMITATION PROTOCOL succeeds in avoiding overshooting effects and 
converges in a monotonic fashion (Section 3). More precisely, we show that a well-known potential function 
(Rosenthal [24]) decreases in expectation as long as the system is not yet at an imitation-stable state. Thus,
the potential is a super-martingale and eventually reaches a local minimum, corresponding to an imitation- 
stable state. Hence, as a corollary, we see that an imitation-stable state is reached in pseudopolynomial time. 

Our main result, presented in Section 4, however, is a much stronger bound on the time to reach approx- 
imate imitation-stable states. What is a natural definition of approximately stable states in our setting? By 
repeatedly sampling other agents, an agent gets to know the average latency of the system. It is approxi- 
mately satisfied, if it does not sustain a latency much larger than the average. Hence, we say that a state is 
approximately stable if almost all agents are almost satisfied. More precisely, we consider states in which at 
most a $\delta$-fraction of the agents deviates by more than an $\varepsilon$-fraction (in any direction) from the average latency.
We show that the expected time to reach such a state is polynomial in the inverse of the approximation
parameters $\delta$ and $\varepsilon$ as well as in the maximum elasticity of the latency functions, and logarithmic in the ratio
between maximum and minimum potential. Hence, if the maximum latency of a path is fixed, the time is 
only logarithmic in the number of players and independent of the size of the strategy space and the number 
of resources. 

We complement these results by various lower bounds. First, it is clear that pseudopolynomial time is 
required to reach exact imitation-stable states. This follows from the fact that there exist states in which 
all latency improvements are arbitrarily small, resulting in arbitrarily small migration probabilities. Hence, 
already a single step may take pseudopolynomially long. As a concept of approximate stable states one could 
have required all agents to be approximately satisfied, rather than only all but a $\delta$-fraction. This, however,
would require waiting a polynomial number of rounds for the last agent to become approximately satisfied,
as opposed to our logarithmic bound. Finally, we consider sequential imitation processes in which only one 
agent may move at a time. We extend a construction from [1] to show that there exist instances in which the 
shortest sequence of imitations that leads to an imitation-stable state is exponentially long. 

The Imitation Protocol has one drawback: It is not innovative in the following sense. It might happen 
with small but non-zero probability that all players currently using the same strategy P migrate towards other 
strategies and no other player migrates towards P. In this case, the knowledge about the existence of strategy 
P is lost and cannot be regained. For singleton games, i. e., games in which each strategy is a singleton set, 
in which empty links have latency zero, we show in Section 5 that the probability of this event occurring in 
a polynomial number of rounds is negligible. This also has an important consequence: The cost of a state
to which the IMITATION PROTOCOL converges is, in expectation, not much worse than the cost of a Nash
equilibrium. More precisely, we show for the case of linear latency functions that the expected cost of a state 
to which the IMITATION PROTOCOL converges is within a constant factor of the optimal solution. 

We conclude with a discussion of a possible extension of the IMITATION PROTOCOL in Section 6. In
cases in which convergence to a Nash equilibrium is required, it is possible to adjust the dynamics and
occasionally let players use an EXPLORATION PROTOCOL. Using such a protocol, players sample other 
strategies directly instead of sampling them by looking at other players. We show that a suitable definition of 
such a protocol and a suitable combination with the IMITATION PROTOCOL guarantee convergence to Nash 
equilibria in the long run. 

To the best of our knowledge, this is the first work that considers concurrent protocols for atomic conges- 
tion games that are not restricted to parallel links and linear latency functions. 

1.2 Related Work 

Rosenthal [24] proves that every congestion game possesses a Nash equilibrium, and that better response 
dynamics converge to Nash equilibria. In these dynamics players have complete knowledge, and, in every 
round, only a single player deviates to a better strategy than it currently uses. Fabrikant et al. [11], however, 
observe that, in general, from an appropriately chosen initial state it takes exponentially many steps until 
players finally reach an equilibrium. This negative result still holds in games with $\varepsilon$-greedy players, i. e., in
games in which players only deviate if their latency decreases by a relative factor of at least $1 + \varepsilon$ [1, 7, 26].
Moreover, Fabrikant et al. [11] prove that, in general, computing a Nash equilibrium is PLS-complete. Their 
result still holds in the case of asymmetric network congestion games. In addition, Skopalik and Vöcking [26]
prove that even computing an approximate Nash equilibrium is PLS-complete. On the positive side, best 
response dynamics converge quickly in singleton and matroid congestion games [1,21]. Additionally, Chien 
and Sinclair [7] consider the convergence time of best response dynamics to approximate Nash equilibria in 
symmetric games. They prove fast convergence to approximate Nash equilibria provided that the latency of 
a resource increases by at most a factor for each additional user. Finally, Goldberg [18] considers a protocol 
applied to a scenario where n weighted users assign load to m parallel links and the latency equals the load of 
a resource. In this protocol, randomly selected players move sequentially, and migrate to a randomly selected 
resource if this improves their latency. The expected time to reach a Nash equilibrium is pseudopolynomial. 
Results considering other protocols and links with latency functions are presented in [9].

The social cost of (approximate) Nash equilibria in congestion games has been subject to numerous stud- 
ies. The most prominent concept has been the price of anarchy [22], which is the ratio of the worst cost of any 
Nash equilibrium over the cost of an optimal assignment. Roughgarden and Tardos [25] conducted the first 
study of general, non-atomic congestion games and showed a tight bound of 4/3 for the price of anarchy with 
linear latency functions. For atomic games and linear latencies, Awerbuch et al. [2] and Christodoulou and 
Koutsoupias [8] show a tight bound of 2.5. The special case of (weighted) singleton games has been of partic- 
ularly strong interest, and we refer the reader to [23, chapter 20] for an introduction to the numerous results. 
In terms of dynamics, Awerbuch et al. [3] consider the number of best-response steps required to reach a 
desirable state, which has a social cost only a constant factor larger than that of a social optimum. They show 
that even in congestion games with linear latencies there are exponentially long best-response sequences for 
reaching such a desirable state. In contrast, Fanelli et al. [12] show that for linear latency functions there are 
also much faster best response sequences that reach a desirable state after at most $\Theta(n \log \log n)$ steps.

Recently, concurrent protocols have been studied in various models and under various assumptions. Even- 
Dar and Mansour [10] consider concurrent protocols in a setting where the links have speeds. However, their 
protocols require global knowledge in the sense that the users must be able to determine the set of underloaded 
and overloaded links. Given this knowledge, the convergence time is doubly logarithmic in the number of 
players. In [4] the authors consider a distributed protocol for the case that the latency equals the load that does 
not rely on this knowledge. Their bounds on the convergence time are also doubly logarithmic in the number 
of players but polynomial in the number of links. In [5] the results are generalized to the case of weighted 
jobs. In this case, the convergence time is only pseudopolynomial, i. e., polynomial in the number of users, 
links, and in the maximum weight. Finally, Fotakis et al. [17] consider a scenario with latency functions for
every resource. Their protocol involves local coordination among the players sharing a resource. For the 
family of games in which the number of players asymptotically equals the number of resources they prove 
fast convergence to almost Nash equilibria. Intuitively, an almost Nash equilibrium is a state in which there 
are not too many too expensive and too cheap resources. In [13], a load balancing scenario is considered in 
which no information about the target resource is available. The authors present an efficient protocol in which 
the migration probability depends purely on the cost of the currently selected strategy. 

In [15] the authors consider congestion games in the Wardrop model, where an infinite population of 
players carries an infinitesimal amount of load each. They consider a protocol similar to ours and prove that 
with respect to approximate equilibria it behaves like an FPTAS, i. e., it reaches an approximate equilibrium 
in time polynomial in the approximation parameters and the representation length of the instance (e. g., if the 
latency functions are polynomials in coefficient representation). In contrast to our work the analysis of the 
continuous model does not have to take into account probabilistic effects. 

Our protocol is based on the notion of imitation, a concept frequently applied in evolutionary game theory. 
For an introduction to imitation dynamics, see, e. g., [20, 27].

2 Congestion Games and Imitation Dynamics 

In this section, we provide a formal description of our model. We define congestion games in terms of 
networks, that is, the strategy space of each player corresponds to the set of paths connecting a particular 
source-sink pair in a network. However, our results are independent of this definition and still hold in general, 
symmetric congestion games. Furthermore, we introduce the slope and the elasticity of latency functions, and 
give a precise definition of the IMITATION PROTOCOL. 

2.1 Symmetric Network Congestion Games 

A symmetric network congestion game is a tuple $(G, (s, t), \mathcal{N}, (\ell_e)_{e \in E})$, where $G = (V, E)$ denotes a
network with vertices $V$ and $m$ directed edges $E$, and $s \in V$ and $t \in V$ denote a source and a sink vertex.
Furthermore, $\mathcal{N}$ denotes a set of $n$ agents or players, and $(\ell_e)_{e \in E}$ a family of non-decreasing and
differentiable latency functions $\ell_e \colon \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$. We assume that for all $e \in E$, the latency functions satisfy
$\ell_e(x) > 0$ for all $x > 0$. The strategy space of all players equals the set of paths $\mathcal{P}$ connecting the source $s$
with the sink $t$. If $G$ consists of two nodes $s$ and $t$ only, which are connected by a set of parallel links, then
we call the game a singleton game. A state $x$ of the game is a vector $(x_P)_{P \in \mathcal{P}}$ where $x_P$ denotes the number
of players utilizing path $P$ in state $x$, and $x_e = \sum_{P \ni e} x_P$ is the congestion of edge $e \in E$ in state $x$. The
latency of edge $e$ in state $x$ is given by $\ell_e(x_e)$, and the latency of path $P \in \mathcal{P}$ is

$$\ell_P(x) = \sum_{e \in P} \ell_e(x_e) .$$

The latency of a player is the latency of the path it chooses. 
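
As a small illustration of these definitions, the following Python sketch computes edge congestions and path latencies for a toy instance. The instance (edge names, the two paths, and the latency functions) is made up for this example and does not come from the paper; paths are represented simply as sets of edges.

```python
from typing import Callable, Dict, FrozenSet

Edge = str
Path = FrozenSet[Edge]
Latency = Dict[Edge, Callable[[int], float]]


def edge_congestion(state: Dict[Path, int]) -> Dict[Edge, int]:
    """Return x_e, the number of players whose path contains edge e."""
    load: Dict[Edge, int] = {}
    for path, players in state.items():
        for edge in path:
            load[edge] = load.get(edge, 0) + players
    return load


def path_latency(path: Path, state: Dict[Path, int], latency: Latency) -> float:
    """Return l_P(x), the sum of l_e(x_e) over the edges e of the path."""
    load = edge_congestion(state)
    return sum(latency[edge](load.get(edge, 0)) for edge in path)


# Hypothetical instance: two s-t paths that share edge "a".
LATENCY: Latency = {"a": lambda x: x, "b": lambda x: 2 * x, "c": lambda x: x * x}
P1: Path = frozenset({"a", "b"})
P2: Path = frozenset({"a", "c"})
STATE = {P1: 3, P2: 2}  # x_{P1} = 3 players, x_{P2} = 2 players

if __name__ == "__main__":
    print(edge_congestion(STATE))
    print(path_latency(P1, STATE, LATENCY), path_latency(P2, STATE, LATENCY))
```

In this toy state the shared edge carries all five players, so a player's latency depends on the choices of players on both paths.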

For brevity, for all $P \in \mathcal{P}$, let $1_P$ denote the $|\mathcal{P}|$-dimensional unit vector with the one in position $P$. In
state $x$ a player has an incentive to switch from path $P$ to path $Q$ if this would strictly decrease its latency,
i. e., if

$$\ell_P(x) > \ell_Q(x + 1_Q - 1_P) .$$

If no player has an incentive to change its strategy, then $x$ is at a Nash equilibrium. It is well known [24] that
the set of Nash equilibria corresponds to the set of states that minimize the potential function

$$\Phi(x) = \sum_{e \in E} \sum_{i=1}^{x_e} \ell_e(i) .$$
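
The following Python sketch evaluates Rosenthal's potential on parallel links and illustrates its defining property: when a single player moves, the potential changes by exactly the change in that player's latency. The two-link instance is hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence


def rosenthal_potential(load: Sequence[int],
                        latency: Sequence[Callable[[int], float]]) -> float:
    """Phi(x) = sum over edges e of l_e(1) + ... + l_e(x_e)."""
    return sum(
        sum(latency[e](i) for i in range(1, load[e] + 1))
        for e in range(len(load))
    )


# Hypothetical two-link instance with l_0(x) = x and l_1(x) = 2x.
LATENCY = [lambda x: x, lambda x: 2 * x]
before = [5, 1]  # five players on link 0, one player on link 1
after = [4, 2]   # one player moved from link 0 to link 1

# Change of the potential caused by the single move.
delta_phi = rosenthal_potential(after, LATENCY) - rosenthal_potential(before, LATENCY)
# Change of the moving player's own latency.
delta_latency = LATENCY[1](after[1]) - LATENCY[0](before[0])

if __name__ == "__main__":
    print(delta_phi, delta_latency)
```

Both quantities coincide here, which is why improving moves by single players always decrease the potential.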






In the following, let $\Phi^* = \min_x \Phi(x)$ be the minimum potential. Note that due to our definition of the latency
functions $\Phi^* > 0$. For every path $P \in \mathcal{P}$ let

$$\ell_P^+(x) = \ell_P(x + 1_P) .$$

Note that for every path $Q \in \mathcal{P}$

$$\ell_P^+(x) \ge \ell_P(x + 1_P - 1_Q) .$$

Additionally, let

$$L_{av}(x) = \sum_{P \in \mathcal{P}} \frac{x_P}{n} \, \ell_P(x)$$

denote the average latency of the paths in state $x$, and let

$$L_{av}^+(x) = \sum_{P \in \mathcal{P}} \frac{x_P}{n} \, \ell_P(x + 1_P) .$$

Finally, let $\ell_{\max} = \max_x \max_{P \in \mathcal{P}} \ell_P(x)$ denote the maximum latency of any path. Throughout this paper,
whenever we consider a fixed state $x$ we simply drop the argument $(x)$ from $\Phi$, $\ell_P$, $\ell_P^+$, $L_{av}$, and $L_{av}^+$.

2.2 The Elasticity and the Slope of Latency Functions 

To bound the steepness of the latency functions and the effect that overshooting may have, we consider the 
elasticity of the latency functions. Let $d$ denote an upper bound on the elasticity of the latency functions, i. e.,

$$d \ge \max_{e \in E} \sup_{x \in (0, n]} \left\{ \frac{\ell_e'(x) \cdot x}{\ell_e(x)} \right\} .$$

Now given a latency function with elasticity $d$, it holds that for any $x$ and $\alpha \ge 1$, $\ell_e(\alpha x) \le \ell_e(x) \cdot \alpha^d$ and
for $0 \le \alpha < 1$, $\ell_e(\alpha x) \ge \ell_e(x) \cdot \alpha^d$. As an example, the function $a x^d$ has elasticity $d$.
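
The elasticity bound can be checked numerically. The sketch below is only an illustration: the polynomial $3x^2 + x$ and the sample points are arbitrary choices, and the supremum over $(0, n]$ is approximated by sampling integer points only.

```python
from typing import Callable


def elasticity(latency: Callable[[float], float],
               derivative: Callable[[float], float],
               x: float) -> float:
    """Elasticity of a latency function at x: l'(x) * x / l(x)."""
    return derivative(x) * x / latency(x)


D = 2  # maximum degree of the example polynomial


def poly(x: float) -> float:
    """Hypothetical latency function 3x^2 + x with positive coefficients."""
    return 3 * x ** 2 + x


def poly_prime(x: float) -> float:
    """Derivative of poly."""
    return 6 * x + 1


def worst_elasticity(n: int) -> float:
    """Largest elasticity of poly over the sampled points 1, ..., n."""
    return max(elasticity(poly, poly_prime, x) for x in range(1, n + 1))


if __name__ == "__main__":
    print(worst_elasticity(100))          # stays below the degree D
    print(poly(1.5 * 4), poly(4) * 1.5 ** D)  # l(ax) versus l(x) * a^d
```

For a pure monomial such as $5x^3$ the same function returns the degree itself at every point, in line with the example $a x^d$ above.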

For almost empty resources, we will also need an upper bound on the slope of the latency functions.
Let $\nu_e$ denote the maximum slope on almost empty edges, i. e.,

$$\nu_e = \max_{x \in \{1, \ldots, d\}} \{ \ell_e(x) - \ell_e(x - 1) \} .$$

Finally, for $P \in \mathcal{P}$, let $\nu_P = \sum_{e \in P} \nu_e$ and choose $\nu$ such that $\nu \ge \max_{P \in \mathcal{P}} \nu_P$.

2.3 The Imitation Protocol 

Our Imitation Protocol (Protocol 1) proceeds in two steps. First, a player samples another agent uniformly
at random. The player then migrates with a certain probability from its old path $P$ to the sampled
path $Q$ depending on the anticipated relative latency gain $(\ell_P(x) - \ell_Q(x + 1_Q - 1_P))/\ell_P(x)$ and on the
elasticity of the latency functions. Our analysis concentrates on dynamics that result from the protocol being
executed by the players in parallel in a round-based fashion. These dynamics generate a sequence of states
$x(0), x(1), \ldots$. The resulting dynamics converge to a state that is stable in the sense that imitation cannot
produce further progress, i. e., $x(t + 1) = x(t)$ with probability 1. Such a state is called an imitation-stable
state. In other words, a state is imitation-stable if it is $\varepsilon$-Nash with $\varepsilon = \nu$ with respect to the strategy space
restricted to the current support. Here, $\varepsilon$-Nash means that no agent can improve its own payoff unilaterally by
more than $\varepsilon$.

As discussed in the introduction, the main difficulty in the design of the protocol is to bound overshooting 
effects. To get an intuition of this problem, consider two parallel links of which the first has the constant
latency function $\ell_1(x) = c$ and the second has the latency function $\ell_2(x) = x^d$. Recall that the elasticity
of $\ell_2$ is $d$. Furthermore, assume that only a small number of agents $x_2$ utilizes link 2 whereas the majority
of $n - x_2$ users utilizes link 1. Let $b = c - x_2^d > 0$ denote the latency difference between the two links.






Protocol 1 IMITATION PROTOCOL, repeatedly executed by all players in parallel.

  Let $P$ denote the path of the player in state $x$.
  Sample another player uniformly at random. Let $Q$ denote its path.
  if $\ell_P(x) > \ell_Q(x + 1_Q - 1_P) + \nu$ then
    with probability

      $$\mu_{PQ} = \frac{\lambda}{d} \cdot \frac{\ell_P(x) - \ell_Q(x + 1_Q - 1_P)}{\ell_P(x)}$$

    migrate from path $P$ to path $Q$.
  end if



A simple calculation shows that using the protocol without the damping factor $1/d$, the expected latency
increase on link 2 would be $O(b \cdot d)$, overshooting the balanced state by a factor $d$. For this reason, we reduce
the migration probability accordingly. The constant $\lambda$ will be determined later.
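
The following back-of-the-envelope estimate spells out this calculation for the two-link example. It is a first-order sketch under two added assumptions that are not stated in the text: $x_2 \ll n$, and the state is close to balance so that $x_2^d \approx c$. Here $\lambda$ denotes the constant factor of Protocol 1, and the shift by one player in the anticipated latency is ignored.

```latex
\begin{align*}
\mathbb{E}[\text{migrants to link 2, undamped}]
  &\approx (n - x_2) \cdot \frac{x_2}{n} \cdot \lambda \, \frac{b}{c}
   \;\approx\; \lambda \, x_2 \, \frac{b}{c} , \\
\mathbb{E}[\text{latency increase on link 2, undamped}]
  &\approx \ell_2'(x_2) \cdot \lambda \, x_2 \, \frac{b}{c}
   \;=\; d \, x_2^{d-1} \cdot \lambda \, x_2 \, \frac{b}{c}
   \;=\; \lambda \, d \, b \cdot \frac{x_2^{d}}{c}
   \;\approx\; \lambda \, d \, b , \\
\mathbb{E}[\text{latency increase on link 2, damped by } 1/d]
  &\approx \lambda \, b .
\end{align*}
```

Under these assumptions the undamped increase exceeds the latency gap $b$ by a factor of order $d$, while the damped protocol closes only a $\lambda$-fraction of the gap per round.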

Note that the arguments in the last paragraph hold for the expected load changes. Our protocol, however, 
has to take care of probabilistic effects, i. e., the realized migration vector may differ from its expectation. 
Typically, we can use the elasticity to bound the impact of this effect. However, if the congestion on an edge 
is very small, i. e., less than d, then the number of joining agents is not concentrated sharply enough around 
its expectation. In order to compensate for this, we add an additional requirement that agents only migrate 
if the anticipated latency gain is at least $\nu$ and use this to bound probabilistic effects if the congestion of the
edge is less than $d$. Let us remark that we will see below (Theorem 9) that for a large class of singleton games
it is very unlikely that an edge will ever have a load of $d$ or less, so the protocol will behave in the same way
with high probability for a polynomial number of rounds even if this additional requirement is dropped. 
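
To make the round structure concrete, here is a minimal Python sketch of one concurrent round of the imitation rule for the special case of singleton games (parallel links). It is an illustration rather than the authors' implementation: the function name, the list-based state representation, and the parameter values used in the example (`lam = 0.25`, `nu = 1.0`, linear latencies) are assumptions made for this sketch.

```python
import random
from typing import Callable, List, Sequence


def imitation_round(assignment: List[int],
                    latency: Sequence[Callable[[int], float]],
                    d: float, lam: float, nu: float,
                    rng: random.Random) -> List[int]:
    """One concurrent round on parallel links.

    assignment[i] is the link used by player i. Every player decides based on
    the loads at the start of the round, so all moves happen in parallel.
    """
    n = len(assignment)
    load = [0] * len(latency)
    for link in assignment:
        load[link] += 1

    new_assignment = list(assignment)
    for i, p in enumerate(assignment):
        # Sample another player uniformly at random (never player i itself).
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1
        q = assignment[j]
        if q == p:
            continue
        l_p = latency[p](load[p])
        l_q = latency[q](load[q] + 1)  # anticipated latency after joining q
        if l_p > l_q + nu:
            mu = (lam / d) * (l_p - l_q) / l_p  # damped migration probability
            if rng.random() < mu:
                new_assignment[i] = q
    return new_assignment


if __name__ == "__main__":
    rng = random.Random(0)
    links = [lambda x: float(x)] * 3      # hypothetical linear latencies
    state = [0] * 9 + [1]                 # link 2 is unused
    for _ in range(200):
        state = imitation_round(state, links, d=1.0, lam=0.25, nu=1.0, rng=rng)
    print([state.count(k) for k in range(3)])
```

Because players only copy strategies they observe, link 2 in the example can never be discovered, which is the lack of innovation discussed in the introduction.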



3 Imitation Dynamics in Games with General Strategy Spaces

In this section, we consider imitation dynamics that emerge if in each round players concurrently apply the
IMITATION PROTOCOL. At first, we observe that imitation dynamics converge to imitation-stable states since
in each round the potential $\Phi(x)$ decreases in expectation. From this result we derive a pseudopolynomial
upper bound on the convergence time to imitation-stable states.

3.1 Pseudopolynomial Time Convergence to Imitation-Stable States

Consider two states $x$ and $x'$ as well as a migration vector $\Delta x = (\Delta x_P)_{P \in \mathcal{P}}$ such that $x' = x + \Delta x$. We
may imagine $\Delta x$ as the result of one round of the IMITATION PROTOCOL although the following lemma is
independent of how $\Delta x$ is constructed. Furthermore, we consider $\Delta x$ to be composed of a set of migrations
of agents between pairs of paths, i. e., $\Delta x_{PQ}$ denotes the number of players who switch from path $P$ to path
$Q$, and $\Delta x_P$ denotes the total increase or decrease of the number of players utilizing path $P$, that is,

$$\Delta x_P = \sum_{Q \in \mathcal{P}} (\Delta x_{QP} - \Delta x_{PQ}) .$$

Also, let $\Delta x_e = \sum_{P \ni e} \Delta x_P$ denote the induced change of the number of players utilizing edge $e \in E$. In
order to prove convergence, we define the virtual potential gain

$$V_{PQ}(x, \Delta x) = \Delta x_{PQ} \cdot (\ell_Q(x + 1_Q - 1_P) - \ell_P(x)) ,$$

which is the sum of the potential gains each player migrating from path $P$ to path $Q$ would contribute to
$\Delta\Phi$ if each of them was the only migrating player. Note that if a player improves the latency of his path,
the potential gain is negative. The sum of all virtual potential gains is a very rough lower bound on the true
potential gain $\Delta\Phi(x, \Delta x) = \Phi(x + \Delta x) - \Phi(x)$. In order to compensate for the fact that players concurrently
change their strategies, consider the error term on an edge $e \in E$:



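$$F_e(x, \Delta x) = \begin{cases}
\sum_{u = x_e + 1}^{x_e + \Delta x_e} (\ell_e(u) - \ell_e(x_e + 1)) & \text{if } \Delta x_e > 0 , \\
\sum_{u = x_e + \Delta x_e + 1}^{x_e} (\ell_e(x_e) - \ell_e(u)) & \text{if } \Delta x_e < 0 , \\
0 & \text{if } \Delta x_e = 0 .
\end{cases}$$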





Subsequently, we show that the sum of the virtual potential gains and the error terms is indeed an upper bound
on the true potential gain $\Delta\Phi(x, \Delta x)$. A similar result is shown in [16] for a continuous model.

Lemma 1. For any assignment $x$ and migration vector $\Delta x$ it holds that

$$\Delta\Phi(x, \Delta x) \le \sum_{P, Q \in \mathcal{P}} V_{PQ}(x, \Delta x) + \sum_{e \in E} F_e(x, \Delta x) .$$

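Proof. We first express the virtual potential gain in terms of latencies on the edges. Clearly,

$$\begin{aligned}
\sum_{P, Q \in \mathcal{P}} V_{PQ}(x, \Delta x)
  &= \sum_{P, Q \in \mathcal{P}} \Delta x_{PQ} \cdot (\ell_Q(x + 1_Q - 1_P) - \ell_P(x)) \\
  &= \sum_{P, Q \in \mathcal{P}} \Delta x_{PQ} \cdot \Big( \sum_{e \in Q \setminus P} \ell_e(x_e + 1) - \sum_{e \in P \setminus Q} \ell_e(x_e) \Big) \\
  &\ge \sum_{e : \Delta x_e > 0} \Delta x_e \cdot \ell_e(x_e + 1) + \sum_{e : \Delta x_e < 0} \Delta x_e \cdot \ell_e(x_e) . \qquad (1)
\end{aligned}$$

The last step holds since the latency functions are non-decreasing: for an edge $e$ with $a$ joining and $b$ leaving
players, the term $a \cdot \ell_e(x_e + 1) - b \cdot \ell_e(x_e)$ is at least $(a - b) \cdot \ell_e(x_e + 1)$ and at least $(a - b) \cdot \ell_e(x_e)$.

The true potential gain, however, is

$$\begin{aligned}
\Delta\Phi(x, \Delta x)
  &= \sum_{e : \Delta x_e > 0} \; \sum_{u = x_e + 1}^{x_e + \Delta x_e} \ell_e(u)
   \; - \sum_{e : \Delta x_e < 0} \; \sum_{u = x_e + \Delta x_e + 1}^{x_e} \ell_e(u) \\
  &= \sum_{e : \Delta x_e > 0} \Big( \Delta x_e \cdot \ell_e(x_e + 1) + \sum_{u = x_e + 1}^{x_e + \Delta x_e} (\ell_e(u) - \ell_e(x_e + 1)) \Big) \\
  &\quad + \sum_{e : \Delta x_e < 0} \Big( \Delta x_e \cdot \ell_e(x_e) + \sum_{u = x_e + \Delta x_e + 1}^{x_e} (\ell_e(x_e) - \ell_e(u)) \Big) .
\end{aligned}$$

Substituting Equation (1) for the left term of each sum and the definition of $F_e$ for the right term of each sum,
we obtain the claim of the Lemma. □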
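
The inequality of Lemma 1 can be checked numerically on small instances. The following Python sketch does this for parallel links, where every path is a single edge; the loads, migration counts, and latency functions are hypothetical values chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Latency = List[Callable[[int], int]]
Moves = Dict[Tuple[int, int], int]  # (origin link, destination link) -> number of movers


def potential(load: List[int], latency: Latency) -> int:
    """Rosenthal potential: sum over links e of l_e(1) + ... + l_e(x_e)."""
    return sum(sum(latency[e](u) for u in range(1, load[e] + 1))
               for e in range(len(load)))


def virtual_gain(load: List[int], moves: Moves, latency: Latency) -> int:
    """Sum of V_PQ = dx_PQ * (l_Q(x_Q + 1) - l_P(x_P)) over all pairs P != Q."""
    return sum(k * (latency[q](load[q] + 1) - latency[p](load[p]))
               for (p, q), k in moves.items())


def error_term(x_e: int, dx_e: int, lat: Callable[[int], int]) -> int:
    """Error term F_e for a link with load x_e and net load change dx_e."""
    if dx_e > 0:
        return sum(lat(u) - lat(x_e + 1) for u in range(x_e + 1, x_e + dx_e + 1))
    if dx_e < 0:
        return sum(lat(x_e) - lat(u) for u in range(x_e + dx_e + 1, x_e + 1))
    return 0


def lemma1_sides(load: List[int], moves: Moves, latency: Latency) -> Tuple[int, int]:
    """Return (true potential gain, sum of virtual gains plus error terms)."""
    delta = [0] * len(load)
    for (p, q), k in moves.items():
        delta[p] -= k
        delta[q] += k
    new_load = [x + dx for x, dx in zip(load, delta)]
    true_gain = potential(new_load, latency) - potential(load, latency)
    bound = virtual_gain(load, moves, latency) + sum(
        error_term(load[e], delta[e], latency[e]) for e in range(len(load)))
    return true_gain, bound


# Hypothetical instance with l_0(x) = x, l_1(x) = x^2, l_2(x) = 2x.
LATENCY: Latency = [lambda x: x, lambda x: x * x, lambda x: 2 * x]
LOAD = [6, 2, 1]
MOVES: Moves = {(0, 1): 2, (0, 2): 1, (1, 2): 1}

if __name__ == "__main__":
    print(lemma1_sides(LOAD, MOVES, LATENCY))
```

The migration vector in the example is arbitrary and not generated by the protocol, which matches the remark that the lemma does not depend on how the migration vector is constructed.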

In the following, we consider Ax to be a migration vector generated by the IMITATION PROTOCOL rather 
than an arbitrary vector. In this case, Ax is a random variable and all probabilities and expectations are taken 
with respect to the IMITATION PROTOCOL. In order to prove that the potential decreases in expectation, we 
derive a bound on the size of the error terms. We show that the error terms reduce the virtual potential gain by 
at most a factor of two, or, put another way, that the true potential gain is at least half of the virtual potential 
gain. 

Lemma 2. Let $x$ denote a state and let the random variable $\Delta x$ denote a migration vector generated by the
IMITATION PROTOCOL. Then,

$$\mathbb{E}[\Delta\Phi(x, \Delta x)] \le \frac{1}{2} \sum_{P, Q \in \mathcal{P}} \mathbb{E}[V_{PQ}(x, \Delta x)] .$$
Proof. For any given round, each term in $V_{PQ}$, $P, Q \in \mathcal{P}$, and $F_e$, $e \in E$, can be associated with an agent.
Fix an agent $i$ migrating from, say, $P$ to $Q$. Its contribution to $V_{PQ}(x, \Delta x)$ is $\ell_Q(x + 1_Q - 1_P) - \ell_P(x)$
(this is the same for all agents moving from $P$ to $Q$). It also contributes to $F_e$, $e \in P \cup Q$. The size of this
term depends on the ordering of the agents. We will consider the migrating agents in ascending order of the
migration probability $\mu_{P_j Q_j}$, where $P_j$ and $Q_j$ denote the origin and destination path of agent $j$, respectively.
Ties are broken arbitrarily.

Fix an edge $e \in E$ and let $A^+(e)$ and $A^-(e)$ denote the set of agents migrating to and away from $e \in E$,
respectively. Let $A(e) = A^+(e) \cup A^-(e)$. Let $\Delta\tilde{x}_e$ denote the contribution to $\Delta x_e$ of agents in $A(e)$ which
occur in our ordering with respect to $\mu_{PQ}$ before agent $i$.




Figure 1: Potential gain of an agent migrating from edge $e'$ towards edge $e$. The hatched area is the agent's
virtual potential gain. The shaded area on the left is this agent's contribution to the error term, caused by the
$\Delta\tilde{x}_e$ agents ranking before the agent under consideration (with respect to $\mu_{PQ}$).

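Agent $i$'s contribution to $F_e(x, \Delta x)$ is $\Delta\ell_e(\Delta\tilde{x}_e)$ where we define the error function $\Delta\ell_e(\delta) = \ell_e(x_e +
1 + \delta) - \ell_e(x_e + 1)$. For an illustration, see Figure 1. Note that there is an exception: If $e \in Q \cap P$, then the
contribution of agent $i$ to $F_e$ is zero and there is nothing to show. For brevity, let us write $\ell_e = \ell_e(x_e)$ and
$\ell_e^+ = \ell_e(x_e + 1)$ as well as $\ell_P = \ell_P(x)$ and $\ell_Q = \ell_Q(x + 1_Q - 1_P)$. For $e \in Q \setminus P$ we show that

$$\mathbb{E}[\Delta\ell_e(\Delta\tilde{x}_e)] \le \frac{1}{8} \cdot (\ell_P - \ell_Q) \cdot \left( \frac{\ell_e^+}{\ell_Q} + \frac{\nu_e}{\nu_Q} \right) \qquad (2)$$

and for $e \in P \setminus Q$,

$$\mathbb{E}[\Delta\ell_e(\Delta\tilde{x}_e)] \le \frac{1}{8} \cdot (\ell_P - \ell_Q) \cdot \left( \frac{\ell_e}{\ell_P} + \frac{\nu_e}{\nu_P} \right) . \qquad (3)$$

Thus, the expected sum of the error terms of an agent migrating from $P$ to $Q$ is at most

$$\sum_{e \in P \setminus Q} \frac{1}{8} (\ell_P - \ell_Q) \left( \frac{\ell_e}{\ell_P} + \frac{\nu_e}{\nu_P} \right)
 + \sum_{e \in Q \setminus P} \frac{1}{8} (\ell_P - \ell_Q) \left( \frac{\ell_e^+}{\ell_Q} + \frac{\nu_e}{\nu_Q} \right)
 \le \frac{1}{2} (\ell_P - \ell_Q) ,$$

i. e., half of its virtual potential gain, which proves the lemma. First, consider the case that $e \in Q$ where $Q$
denotes the destination path of agent $i$.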
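For brevity, let us write $I_{PQ} = (\ell_P - \ell_Q)/\ell_P$ for the incentive to migrate from $P$ to $Q$. Again, consider
the case that $e \in Q$ where $Q$ denotes the destination path of agent $i$. Then, due to our ordering of the agents,

$$\mathbb{E}[\Delta\tilde{x}_e] \le n \cdot \frac{x_e}{n} \cdot \mu_{PQ} \le \frac{\lambda \cdot x_e \cdot I_{PQ}}{d} , \qquad (4)$$

implying

$$x_e \ge \frac{\mathbb{E}[\Delta\tilde{x}_e] \cdot d}{\lambda \cdot I_{PQ}} . \qquad (5)$$

Furthermore, due to the elasticity of $\ell_e$, and using $(1 + 1/x)^x \le \exp(1)$, we obtain

$$\Delta\ell_e(\delta) \le \ell_e^+ \cdot \left( \frac{x_e + 1 + \delta}{x_e + 1} \right)^d - \ell_e^+
 \le e^{d\delta/(x_e + 1)} \cdot \ell_e^+ - \ell_e^+
 \le \ell_e^+ \cdot \left( e^{d\delta/x_e} - 1 \right) . \qquad (6)$$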

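Subsequently, we consider two cases.

Case 1: $\mathbb{E}[\Delta\tilde{x}_e] \ge \frac{1}{64}$. Substituting Inequality (5) into Inequality (6), we obtain for every $\kappa \in \mathbb{R}_{\ge 0}$

$$\Delta\ell_e(\kappa \, \mathbb{E}[\Delta\tilde{x}_e]) \le \ell_e^+ \cdot \left( e^{\kappa \lambda I_{PQ}} - 1 \right) .$$

Now, note that for every $k \in \mathbb{N}$ and $\kappa \in [k, k + 1]$

$$\mathbb{P}[\Delta\tilde{x}_e \ge \kappa \, \mathbb{E}[\Delta\tilde{x}_e]] \le \mathbb{P}[\Delta\tilde{x}_e \ge k \, \mathbb{E}[\Delta\tilde{x}_e]]
 \quad \text{and} \quad
 \Delta\ell_e(\kappa \, \mathbb{E}[\Delta\tilde{x}_e]) \le \Delta\ell_e((k + 1) \, \mathbb{E}[\Delta\tilde{x}_e])$$

hold. Applying a Chernoff bound (Fact 16 in the appendix), we obtain an upper bound for the expectation
of $\Delta\ell_e(\Delta\tilde{x}_e)$ as follows.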

E 



Ale (Ax e ) 

A4(Ax e )l < ^>[Ax e > fcE[Ax e ]] • A£ e ((k + l)E[Ax e ]) 

k=l 

oo 

< Alt (5E [Ax e ]) + p > fcE [Ax e ]] • Al e ((k + 1) E [Ax e ]) 

fe=5 
oc 

< It ■ (c 5A/pQ - l) + ^ e -3 E [ A£ =] fc lnfc . £+ . ^ e (fe+1) A/pQ - 1 

oo 



fc=5 

/•OO 



/"OO 

< ( C 5A/ PQ -l)+y c -iE[AS e ]«.^+. ( c 2nA/ PQ _ ^ 

/ P 8 A /pq _ 1 , 8A Jpq \ 

" 6 I" + |E [Ax e ] — 2XI P Q J • 
Now, due to Fact 17 in the appendix (with $r = 1$) and our assumption that $\mathbb{E}[\Delta\tilde{x}_e] \ge 1/64$, we obtain



E 



Ai e (Ax e ) < X-£t-IpQ- (5(e-l) + 



8 (c - 1) + 8 • 64 



■ . c • A • # • 



for some constant $c$. The first inequality holds if $\lambda < 1/512$, proving Equation (2) if $\lambda$ is chosen small
enough.

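Case 2: $\mathbb{E}[\Delta\tilde{x}_e] < \frac{1}{64}$. Again, in this case we can apply a Chernoff bound (Fact 16) to upper bound $\mathbb{E}[\Delta\ell_e(\Delta\tilde{x}_e)]$:

$$\begin{aligned}
\mathbb{E}[\Delta\ell_e(\Delta\tilde{x}_e)]
  &\le \sum_{k=1}^{n} \mathbb{P}[\Delta\tilde{x}_e = k] \cdot \Delta\ell_e(k) \\
  &\le \sum_{k=1}^{n} \mathbb{P}\left[ \Delta\tilde{x}_e \ge \frac{k}{\mathbb{E}[\Delta\tilde{x}_e]} \cdot \mathbb{E}[\Delta\tilde{x}_e] \right] \cdot \Delta\ell_e(k) \\
  &\le \sum_{k=1}^{n} e^{-k (\ln(k / \mathbb{E}[\Delta\tilde{x}_e]) - 1)} \cdot \Delta\ell_e(k) .
\end{aligned}$$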

There are two sub-cases: 

Case 2a: $x_e \ge d$. In order to bound the expected latency increase, we apply the elasticity bound on $\ell_e$:



E 



A4(A£ e 



< ^c-MM 
fe=l 



n/ E[AiJ)-l) . ^ e |f _ ^ 

< £+ . C- k On(fe)-ln(E[A£ c ])-l) . ^ _ ^ 

fc=l 
n 

< ei-J2( E t A ^] ( e?£ E [ A ^] fe_1 )) e ~ k (ln fe) • - i) 

n 

< e-E[Ai e ]-^c- fe ( lnfe ).(c^-l) . 

fc=l 

Now, splitting up the sum, we define 

Li - E [Ax e ] J2 e- fc ( lnfc )-(e^-l) 



fe=i 



< E[Ai e } { ° 8 l)d c- fe(lnfe) -fc 



fc=i 



e 8 d 
U < — • E [Ax e ] — 
where the first inequality uses the observation that $e^{dk/x_e} \le e^8$ since $k \le \lfloor 8 x_e / d \rfloor$, and Fact 17 (with
$r = 8$). Additionally, the third inequality uses the observation that $\sum_{k=1}^{\infty} e^{-k \ln k} \cdot k \le 2$,
and finally the last inequality uses Inequality (4).
For the second part of the sum, let



L 2 = E [Ax e ] Yl e- fc ( lnfe )-(c^ 

CO 

< E[Az e ] ]T e - fc ( lnfe )+^ 

CO 

= E[A£ e ] e^ 1 "*" 1 ) 

CO 

< E [Ax e ] ^ c-? klnk 



-0 



(since x e > rf) 

(since fc > [^] > 8) 



< E [Ax e ] £ 



Due to Fact 18 and since x e > d 



E[AZ e 



8i e y 



< 



, [Az e 



A/ 



Reassembling the sum, we obtain 



E 



A£ e (Ax e ) < £t-(L 1 +L 2 ) 



< £t-[- + l) XI PQ 



Again, by the same arguments as at the end of Case 1 this proves Equation (2) if $\lambda$ is less than
$1/(2e^8 + 8)$.



Case 2b: $x_e < d$. In this case we separate the upper bound on $\mathbb{E}[\Delta\ell_e(\Delta\tilde{x}_e)]$ into the section up to $d$
and above $d$. For the first section we use the fact that each additional player on resource $e$ causes
a latency increase of at most $\nu_e$ as long as the load is at most $d$. We define the contribution to
the expected latency increase by the events that up to $d - x_e$ join resource $e$, i. e., afterwards the
congestion is still at most $d$. In this case, we may use $\nu_e$ to bound the contribution of each agent:

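$$\begin{aligned}
L_1 &\le \sum_{k=1}^{d - x_e} e^{-k (\ln(k / \mathbb{E}[\Delta\tilde{x}_e]) - 1)} \cdot k \, \nu_e \\
    &\le e \, \nu_e \, \mathbb{E}[\Delta\tilde{x}_e] + \nu_e \, \mathbb{E}[\Delta\tilde{x}_e]^2 \sum_{k=2}^{\infty} e^{-k (\ln k - 1)} \cdot k \\
    &\le e \, \nu_e \, \mathbb{E}[\Delta\tilde{x}_e] \cdot \left( 1 + \frac{8 \, \mathbb{E}[\Delta\tilde{x}_e]}{e} \right) \\
    &\le 3 \nu_e \, \mathbb{E}[\Delta\tilde{x}_e] ,
\end{aligned}$$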

where the third inequality holds since Y^k=2 e ~ k ^ ln ^ _1 ) • k < 8, and where the last inequality 
holds since E [Ax e ] < 1/64. 

For the contribution of the agents increasing the load on resource $e$ to above $d$ we use the elasticity
constraint again. This time, we do not consider the latency increase with respect to $\ell_e(x_e)$ but
with respect to $\ell_e(d)$:



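$$
L_2 = \sum_{k=d-x_e+1}^{n} e^{-k \ln\left(k/\mathrm{E}[\Delta\tilde{x}_e]\right)} \cdot \ell_e(d) \cdot \left(e^{k-(d-x_e)} - 1\right) .
$$

As in Case 2a,

$$
\begin{aligned}
L_2 &\le \ell_e(d) \cdot \mathrm{E}[\Delta\tilde{x}_e] \cdot \sum_{k=d-x_e+1}^{\infty} e^{-k \ln k + k - (d-x_e)} \\
    &= \ell_e(d) \cdot \mathrm{E}[\Delta\tilde{x}_e] \cdot \sum_{k=1}^{\infty} e^{-(k+(d-x_e)) \ln(k+(d-x_e)) + k} \\
    &= \ell_e(d) \cdot \mathrm{E}[\Delta\tilde{x}_e] \cdot e^{-(d-x_e)} \cdot \sum_{k=1}^{\infty} e^{-(k+(d-x_e)) \ln(k+(d-x_e)) + k + d - x_e} .
\end{aligned}
$$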

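Consider the series in the above expression as a function of $u = (d - x_e)$ and denote it by $S(u)$.
Note that $S(u)$ converges for every $u \ge 0$ and $S(u) \to 0$ as $u \to \infty$. In particular, $S(u) < 8$ for
any $u \ge 0$, so

$$
\begin{aligned}
L_2 &\le 8\,\ell_e(d) \cdot \mathrm{E}[\Delta\tilde{x}_e] \cdot e^{-(d-x_e)} \\
    &\le 8\,(\ell_e(x_e) + (d-x_e)\,\nu_e) \cdot \mathrm{E}[\Delta\tilde{x}_e] \cdot e^{-(d-x_e)} .
\end{aligned}
$$

Since $(d - x_e) \cdot e^{-(d-x_e)} \le 1/2$,

$$
L_2 \le 4\,(\ell_e(x_e) + \nu_e) \cdot \mathrm{E}[\Delta\tilde{x}_e] .
$$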

Altogether, 



E 



A4(Ai e ) 



< ii + L 2 

< 7v e E[Ax e ] + 4£ e (x e )E[Ax e ] 

< 7v e E[Ax e ]+4 XXe * PQ -e e (x e ) 

7 v e A\x e I PQ 

64 Uq d 
where we have used Equation (4) for the third inequality, and the inequalities $\mathrm{E}[\Delta\tilde{x}_e] \le 1/64$
and $\nu \ge \nu_Q$ for the last step. Since $x_e < d$ and $\ell_P - \ell_Q > \nu$,



E 



A4(Ai e ) 



•4) 



(^e) 



again proving Equation (2) if $\lambda < 1/32$.

Finally, the case $e \in P$ is very similar. □

Note that all migrating players add a negative contribution to the virtual potential gain since they migrate 
only from paths with currently higher latency to paths with lower latency. Hence, together with Lemma 2, we 
can derive the next corollary. 

Corollary 3. Consider a symmetric network congestion game $\Gamma$ and let $x$ and $x'$ denote states of $\Gamma$ such that
$x'$ is a random state generated after one round of executing the IMITATION PROTOCOL. Then,

$$ \mathrm{E}[\Phi(x')] \le \Phi(x) $$

with strict inequality as long as $x$ is not imitation-stable. Thus, $\Phi$ is a super-martingale.
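
To make the object of Corollary 3 concrete, the following is a minimal sketch of one concurrent round, restricted to a singleton game with linear latencies $a_e \cdot x_e$. It assumes that each player samples uniformly among all $n$ players and migrates with probability $(\lambda/d)\cdot(\text{gain})/(\text{current latency})$ whenever the anticipated gain exceeds $\nu$, which matches the migration terms used in the proofs of this paper. The names `imitation_round`, `slopes`, `lam` and `rng` are illustrative choices, not notation from the paper.

```python
import random


def imitation_round(assignment, slopes, lam, d, nu, rng):
    """One concurrent round; assignment[i] is the resource used by player i."""
    loads = [0] * len(slopes)
    for e in assignment:
        loads[e] += 1
    n = len(assignment)
    nxt = list(assignment)
    for i, p in enumerate(assignment):
        # Assumption: the sampled player is drawn uniformly from all n players.
        q = assignment[rng.randrange(n)]
        if q == p:
            continue
        lat_p = slopes[p] * loads[p]        # current latency of player i
        lat_q = slopes[q] * (loads[q] + 1)  # anticipated latency after joining q
        gain = lat_p - lat_q
        # Migrate only if the gain exceeds nu, with probability (lam/d) * gain / lat_p.
        # All decisions use the loads from the start of the round (concurrency).
        if gain > nu and rng.random() < (lam / d) * gain / lat_p:
            nxt[i] = q
    return nxt
```

Since a player can only copy a strategy that some player currently uses, a resource that is empty at the start of a round stays empty in this sketch, and the number of players is conserved.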

It is obvious that the sequence of states generated by the IMITATION PROTOCOL terminates at an imitation- 
stable state. From Lemma 2 we can immediately derive an upper bound on the time to reach such a state. 
However, since for arbitrary latency functions the minimum possible latency gain may be very small, this 
bound can clearly be only pseudo-polynomial. To see this, consider a state in which only one player can 
make an improvement. Then, the expected time until the player moves is inversely proportional to its latency
gain.

Theorem 4. Consider a symmetric network congestion game in which all players use the IMITATION PROTOCOL.
Let $x$ denote the initial state of the dynamics. Then the dynamics converge to an imitation-stable
state in expected time

$$ O\left( \frac{d\, n\, \ell_{\max}\, \Phi(x)}{\lambda\, \nu^2} \right) . $$

Proof. By definition of the IMITATION PROTOCOL, the expected virtual potential gain in any state $x'$ which
is not yet imitation-stable satisfies

$$ \mathrm{E}\Big[ \sum_{P,Q \in \mathcal{P}} V_{PQ}(x', \Delta x') \Big] \le -\frac{\lambda}{d} \cdot \frac{\nu}{n\, \ell_{\max}} \cdot \nu . $$

Hence, by Lemma 2, the expected potential gain $\mathrm{E}[\Delta\Phi(x')]$ in every intermediate state $x'$ of the
dynamics is bounded from above by half of the above value. From this, it follows that the expected time until
the potential drops from at most $\Phi(x)$ to the minimum potential $\Phi^*$ is at most

$$ \frac{d\, n\, \ell_{\max}\, (\Phi(x) - \Phi^*)}{\lambda\, \nu^2} . $$

Formally, this is a consequence of Lemma 20 which can be found in the Appendix. □

It is obvious that this result cannot be significantly improved since we can easily construct an instance
and a state such that the only possible improvement that can be made is $\nu$. Hence, already a single step takes
pseudopolynomially long. In case of polynomial latency functions Theorem 4 reads as follows.

Corollary 5. Consider a symmetric network congestion game with polynomial latency functions with maximum
degree $d$ and minimum and maximum coefficients $a_{\min}$ and $a_{\max}$, respectively. Let $k = \max_{P \in \mathcal{P}} |P|$.
Then the dynamics converge to an imitation-stable state in expected time



Let us remark that none of the proofs in this section relies on the assumption that the underlying congestion
game is symmetric. In fact, the lemma also holds for asymmetric congestion games in which each player
samples only among players that have the same strategy space.

3.2 Sequential Imitation Dynamics and a Lower Bound 

In the previous section, we proved that players applying the IMITATION PROTOCOL reach an imitation-stable
state after a pseudopolynomial number of rounds. Recall that in this case each player would decrease its latency
by at least $\nu$ if it were the only player to change its strategy. In this section, we consider sequential imitation
dynamics such that in each round a single player is permitted to imitate someone else. Furthermore, we
assume that each player changes its path regardless of the size of the anticipated latency gain. Now, it is obvious
that sequential imitation dynamics converge towards imitation-stable states as the potential $\Phi$ strictly decreases
after every strategy change. Hence, we focus on the convergence time of such dynamics.
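
The monotonicity claim can be illustrated with a small sketch. It assumes that $\Phi$ is Rosenthal's potential, restricts attention to a singleton game with integer linear latencies $a_e \cdot x_e$, and lets a player imitate only if this strictly lowers its latency. The function names and the fixed scan order are illustrative choices, not part of the paper.

```python
def rosenthal_potential(loads, slopes):
    # Phi(x) = sum_e sum_{k=1..x_e} a_e * k = sum_e a_e * x_e * (x_e + 1) / 2
    return sum(a * x * (x + 1) // 2 for a, x in zip(slopes, loads))


def sequential_imitation(loads, slopes):
    """Let one player at a time imitate a player on another *used* resource
    whenever this strictly lowers its latency; return final loads and the
    sequence of potential values."""
    loads = list(loads)
    trace = [rosenthal_potential(loads, slopes)]
    m = len(loads)
    while True:
        move = next(
            ((p, q) for p in range(m) for q in range(m)
             if p != q and loads[p] > 0 and loads[q] > 0
             and slopes[q] * (loads[q] + 1) < slopes[p] * loads[p]),
            None,
        )
        if move is None:
            return loads, trace
        p, q = move
        loads[p] -= 1
        loads[q] += 1
        trace.append(rosenthal_potential(loads, slopes))
```

Each move changes the potential by exactly the mover's latency change, so the recorded sequence is strictly decreasing and the loop terminates.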

For such sequential imitation dynamics we prove an exponential lower bound on the number of rounds to
reach an imitation-stable state. To be precise, we present a family of symmetric network congestion games
with corresponding initial states such that every sequence of imitations leading to an imitation-stable state is
exponentially long. To some extent, this result complements Theorem 4 as it presents an exponential lower
bound in a slightly different model. However, in this lower bound $\nu$ is arbitrarily large and almost every state
is imitation-stable with respect to the IMITATION PROTOCOL.

Theorem 6. For every $n \in \mathbb{N}$, there exists a symmetric network congestion game with $n$ players, initial
state $S^{\mathrm{init}}$, polynomially bounded network size, and linear latency functions such that every sequential
imitation dynamics that starts in $S^{\mathrm{init}}$ is exponentially long.

Subsequently, we do not give a complete proof of the theorem but we discuss how to adapt a series
of constructions as presented in [1] which show that there exists a family of symmetric network congestion
games with the same properties as stated in the above theorem such that every best response dynamics starting
in $S^{\mathrm{init}}$ is exponentially long. To be precise, they prove that in every intermediate state of the best response
dynamics exactly one player can improve its latency. Recall that in best response dynamics players know the
entire strategy space and that in each round one player is permitted to switch to the best available path.

In the following, we summarize the constructions presented in [1]. At first, a PLS-reduction from the 
local search variant of MaxCut to threshold games is presented. In a threshold game, each player either 
allocates a single resource on its own or shares a bunch of resources with other players. Hence, in a threshold 
game each player chooses between two strategies only. The precise definition of these games is given below. 
Then, a PLS-reduction from threshold games to asymmetric network congestion games is presented. Finally, 
the authors of [1] show how to transform an asymmetric network congestion game into a symmetric one 
such that the desired properties of best response dynamics are preserved. All PLS-reductions are embedding, 
and there exists a family of instances of MaxCut with corresponding initial configurations such that in every 
intermediate configuration generated by a local search algorithm exactly one node can be moved to the other 
side of the cut. Therefore, there exists a family of symmetric network congestion games with the properties 
as stated above. 

A naive approach to prove a lower bound on the convergence time of imitation dynamics in symmetric
network congestion games is as follows. Building upon the lower bound on the convergence time of best
response dynamics, a player for every path is added to the game. Then the latency functions are adapted
accordingly. However, in this case we would introduce an exponential number of additional players. In
threshold games, however, the players' strategy spaces have size two only. Hence, we could apply this
approach to threshold games. In the following, we present the details of this approach. It is then not difficult
to verify that the PLS-reductions mentioned above can be reworked in order to prove Theorem 6. However,
note that this does not imply that computing an imitation-stable state is PLS-complete since one can always
assign all players to the same strategy which obviously is an imitation-stable state.

Threshold games are a special class of congestion games in which the set of resources $\mathcal{R}$ can be divided
into two disjoint sets $\mathcal{R}_{\mathrm{in}}$ and $\mathcal{R}_{\mathrm{out}}$. The set $\mathcal{R}_{\mathrm{out}}$ contains exactly one resource $r_i$ for every
player $i \in \mathcal{N}$. This resource has a fixed latency $T_i$ called the threshold of player $i$. Each player $i$ has
only two strategies, namely a strategy $S_i^{\mathrm{out}} = \{r_i\}$ with $r_i \in \mathcal{R}_{\mathrm{out}}$, and a strategy $S_i^{\mathrm{in}} \subseteq \mathcal{R}_{\mathrm{in}}$. The
preferences of player $i$ can be described in a simple and intuitive way: Player $i$ prefers strategy $S_i^{\mathrm{in}}$ to
strategy $S_i^{\mathrm{out}}$ if the latency of $S_i^{\mathrm{in}}$ is smaller than the threshold $T_i$. Quadratic threshold games are a
subclass of threshold games in which the set $\mathcal{R}_{\mathrm{in}}$ contains exactly one resource $r_{ij}$ for every unordered pair
of players $\{i, j\} \subseteq \mathcal{N}$. Additionally, for every player $i \in \mathcal{N}$ of a quadratic threshold game,
$S_i^{\mathrm{in}} = \{r_{ij} \mid j \in \mathcal{N}, j \ne i\}$. Moreover, for every resource $r_{ij} \in \mathcal{R}_{\mathrm{in}}$: $\ell_{r_{ij}}(x) = a_{ij} \cdot x$ with
$a_{ij} \in \mathbb{N}$, and for every resource $r_i \in \mathcal{R}_{\mathrm{out}}$: $\ell_{r_i}(x) = \frac{1}{2} \sum_{j \ne i} a_{ij} \cdot x$.

Let $\Gamma$ be a quadratic threshold game that has an initial state $S^{\mathrm{init}}$, such that every best response dynamics
which starts in $S^{\mathrm{init}}$ is exponentially long, and every intermediate state has a unique player which can improve
its latency. Suppose now that we replace every player $i$ in $\Gamma$ by three players $i_1$, $i_2$ and $i_3$ which all have the
same strategy space as player $i$. Additionally, suppose that we choose new latency functions $\ell'$ for every
resource $r_i$ as follows: $\ell'_{r_i}(x) = \frac{1}{2} \sum_{j \ne i} a_{ij} \cdot x + \frac{3}{2} \sum_{j \ne i} a_{ij}$. Hence, we add an additional offset of
$\frac{3}{2} \sum_{j \ne i} a_{ij}$.

Suppose now that we assign every player $i_1$ to $S_i^{\mathrm{out}}$, and every player $i_2$ to $S_i^{\mathrm{in}}$. For every possible strategy
that the $i_3$ players can use, their latency increases by $2 \sum_{j \ne i} a_{ij}$ compared to the equivalent state in the
original game, in which every player $i$ chooses the same strategy as player $i_3$ does. Hence, if we assign
every player $i_3$ to the strategy chosen by player $i$ in $S^{\mathrm{init}}$ and if the players $i_1$ and $i_2$ were not permitted to
change their strategies, then we would obtain the desired lower bound on the convergence time of imitation
dynamics in threshold games. However, since $i_1$ and $i_2$ are also permitted to imitate, it remains to show
that whenever player $i_3$ has changed its strategy, neither $i_1$ nor $i_2$ wants to change its strategy
anymore.

First, suppose that player $i_3$ switches from the strategy of player $i_2$ to the strategy of player $i_1$. Obviously,
player $i_1$ does not want to change its strategy as otherwise $i_3$ would not have imitated $i_1$. Suppose now that
$i_2$, whose strategy is dropped by $i_3$, also wants to imitate $i_1$. In this case all three players would allocate $S_i^{\mathrm{out}}$
and hence have latency $3 \sum_{j \ne i} a_{ij}$. However, if player $i_2$ stays with strategy $S_i^{\mathrm{in}}$ then its latency is
upper bounded by $2 \sum_{j \ne i} a_{ij}$. Hence, players $i_1$, $i_2$, $i_3$ will never select $S_i^{\mathrm{out}}$ at the same time.

Second, suppose that player $i_3$ switches from the strategy of player $i_1$ to the strategy of player $i_2$. Now,
player $i_2$ does not want to change its strategy as otherwise $i_3$ would not have imitated $i_2$. Suppose now that
$i_1$, whose strategy is dropped by $i_3$, also wants to imitate $i_3$. In this case, the latency would increase to at least
$3 \sum_{j \ne i} a_{ij}$, whereas player $i_1$ would have latency $2 \sum_{j \ne i} a_{ij}$ if it stayed with strategy $S_i^{\mathrm{out}}$. Hence,
players $i_1$, $i_2$, $i_3$ will never select $S_i^{\mathrm{in}}$ at the same time.

By applying the argument that all three players never allocate the same strategy at the same point in time 
we can conclude our claim and Theorem 6 follows. 
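
The latency values used in this argument can be checked with a few lines of exact arithmetic. The sketch below assumes the modified latency $\ell'_{r_i}(x) = \frac{1}{2}\sum_{j\ne i} a_{ij}\cdot x + \frac{3}{2}\sum_{j\ne i} a_{ij}$, which is the offset that makes the stated increase of $2\sum_{j \ne i} a_{ij}$ come out; the coefficient row and the function names are hypothetical.

```python
from fractions import Fraction


def out_latency_original(a_row, load):
    # l_{r_i}(x) = 1/2 * sum_{j != i} a_ij * x
    return Fraction(1, 2) * sum(a_row) * load


def out_latency_modified(a_row, load):
    # Assumed form: l'_{r_i}(x) = 1/2 * sum a_ij * x + 3/2 * sum a_ij
    return Fraction(1, 2) * sum(a_row) * load + Fraction(3, 2) * sum(a_row)


def in_latency(a_row, loads):
    # Latency of S_i^in: sum over the resources r_ij of a_ij * (load on r_ij)
    return sum(a * x for a, x in zip(a_row, loads))
```

With these definitions, a player alone on $r_i$ has latency $2\sum_{j\ne i} a_{ij}$ and three players on $r_i$ have latency $3\sum_{j\ne i} a_{ij}$, matching the two values compared in the second case above.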

4 Fast Convergence to Approximate Equilibria 

Theorem 4 guarantees convergence of concurrent imitation dynamics generated by the IMITATION PROTOCOL
to an imitation-stable state in the long run. However, it does not give a reasonable bound on the time due
to the small progress that can be made. Hence, as our main result, we present bounds on the time to reach an
approximate equilibrium. Here we relax the definition of an imitation-stable state in two aspects: We allow
only a small minority of agents to deviate by more than a small amount from the average latency. Our notion
of an approximate equilibrium is similar to the notion used in [6, 15, 17]. It is motivated by the following
observation. When sampling other players each player gets to know its latency if it would adopt that player's
strategy. Hence, to some extent each player can compute the average latency $L_{av}^+$ and determine if its own
latency is above or below that average.

Definition 1 ($(\delta,\epsilon,\nu)$-equilibrium). Given a state $x$, let the set of expensive paths be
$\mathcal{P}^+_{\epsilon,\nu} = \{P \in \mathcal{P} : \ell_P(x) > (1 + \epsilon)\, L_{av}^+ + \nu\}$ and let the set of cheap paths be
$\mathcal{P}^-_{\epsilon,\nu} = \{P \in \mathcal{P} : \ell_P(x) < (1 - \epsilon)\, L_{av} - \nu\}$. Let $\mathcal{P}_{\epsilon,\nu} = \mathcal{P}^+_{\epsilon,\nu} \cup \mathcal{P}^-_{\epsilon,\nu}$. A configuration $x$ is at a
$(\delta,\epsilon,\nu)$-equilibrium iff it holds that $\sum_{P \in \mathcal{P}_{\epsilon,\nu}} x_P \le \delta \cdot n$.
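
As an illustration of Definition 1, the following sketch tests the condition for a singleton game, where every path is a single resource. It assumes $L_{av} = \sum_e (x_e/n)\,\ell_e(x_e)$ and takes $L_{av}^+$ to be the average ex-post latency $\sum_e (x_e/n)\,\ell_e(x_e + 1)$; both the function name and this reading of $L_{av}^+$ are assumptions made for the example.

```python
def is_approx_equilibrium(loads, latency_fns, delta, eps, nu):
    """Check the (delta, eps, nu)-equilibrium condition in a singleton game."""
    n = sum(loads)
    used = [(x, f) for x, f in zip(loads, latency_fns) if x > 0]
    l_av = sum(x * f(x) for x, f in used) / n           # average latency
    l_av_plus = sum(x * f(x + 1) for x, f in used) / n  # average ex-post latency (assumed)
    unsatisfied = sum(
        x for x, f in used
        if f(x) > (1 + eps) * l_av_plus + nu    # agents on expensive resources
        or f(x) < (1 - eps) * l_av - nu         # agents on cheap resources
    )
    return unsatisfied <= delta * n
```

For instance, with identical latencies $\ell(x) = x$ and loads $(10, 1, 1)$, the two agents on the lightly loaded resources count as cheap for $\epsilon = 0.1$ and $\nu = 1$, so the state qualifies only if $\delta \cdot n \ge 2$.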

Intuitively, a state at $(\delta,\epsilon,\nu)$-equilibrium is a state in which almost all agents are almost satisfied when
comparing their own situation with the situation of other agents. One may hope that it is possible to quickly
reach a state in which all agents are almost satisfied. This would be a relaxation of the concept of Nash
equilibrium. We will argue below, however, that there is no rapid convergence to such states.

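Theorem 7. For an arbitrary initial assignment $x_0$, let $\tau$ denote the first round in which the IMITATION
PROTOCOL reaches a $(\delta,\epsilon,\nu)$-equilibrium. Then,

$$ \mathrm{E}[\tau] = O\left( \frac{d}{\epsilon^2\, \delta} \cdot \log \frac{\Phi(x_0)}{\Phi^*} \right) . $$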



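Proof. We consider a state $x(t)$ that is not at a $(\delta,\epsilon,\nu)$-equilibrium and derive a lower bound on the expected
potential gain. There are two cases. Either at least half of the agents utilizing paths in $\mathcal{P}_{\epsilon,\nu}$ utilize paths in
$\mathcal{P}^+_{\epsilon,\nu}$ or at least half of them utilize paths in $\mathcal{P}^-_{\epsilon,\nu}$.

Case 1: Many agents use expensive paths, i.e., $\sum_{P \in \mathcal{P}^+_{\epsilon,\nu}} x_P \ge \delta\, n/2$. Let us define the volume $T$ and
the average ex-post latency $C$ of potential destination paths, i.e., paths with ex-post latency at most
$(1 + \epsilon)\, L_{av}^+$, by

$$ T = \sum_{Q:\, \ell_Q^+ \le (1+\epsilon) L_{av}^+} \frac{x_Q}{n} \qquad \text{and} \qquad C = \frac{1}{T} \sum_{Q:\, \ell_Q^+ \le (1+\epsilon) L_{av}^+} \frac{x_Q}{n}\, \ell_Q^+ . $$

Clearly,

$$ L_{av}^+ = \sum_{P} \frac{x_P}{n}\, \ell_P^+ \ge T \cdot C + (1 - T) \cdot (1 + \epsilon)\, L_{av}^+ , $$

and solving for $T$ yields

$$ T \ge \frac{\epsilon\, L_{av}^+}{(1 + \epsilon)\, L_{av}^+ - C} . \qquad (7) $$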



We now give a lower bound on the expected virtual potential gain given that the current state is not at
a $(\delta,\epsilon,\nu)$-equilibrium. We consider only the contribution of agents utilizing paths in $\mathcal{P}^+_{\epsilon,\nu}$ and sampling
paths with ex-post latency below $(1 + \epsilon)\, L_{av}^+$. Then,



E 



£Vpq 

P,Q 



< 



X 



± V xp V 

P6Pt Q:i+<(l+e)L+ 



x q tp- Zq{x + 1q-1 p ) 



A 



n 



lp-t? 



PeV+v Q:i+<(l+e)L+ 

Using Jensen's inequality (Fact 19) and substituting lp > yields 



(£p - £q(x + 1 Q - 1 P 



E 



P,Q 



< 



E 



perl 




XQ_ 

Q:£+<(l+e)L+ n 



Now we substitute $\ell_P \ge (1 + \epsilon)\, L_{av}^+$ and use the fact that the squared expression is monotone in $\ell_P$.
Furthermore, we substitute the definition of $T$ and $C$ to obtain



E 



P,Q 



< 



X L + £ x j T(l + e)L+-E Q ^< (1+£)L i X -^ \ I 



Pev: 



(1 + e)L+ 



T 



(1 + e)L+ 



A,+ f(l + e)L+-C 



T- ^ *p • 

Pev+ 



(1 + e)L a + v 

We can now use the tradeoff shown in Equation (7), $C \le L_{av}^+$, and $\sum_{P \in \mathcal{P}^+_{\epsilon,\nu}} x_P \ge \delta\, n/2$ to obtain



E 



A (l + e)L+-C + ^ 

A e (5 n 

< . e . £L_ . — 

d (1 + e) 2 2 



Since nL+ > we have by Lemma 2 



E[$(x(t + 1))] < $(x(t)) - -E 



e 2 -<5 



• n 



L± 



£Vpq 

P,Q 



< <f>(x{t)) ( 1 SI 



rf 



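Case 2: Many agents use cheap paths, i.e., $\sum_{P \in \mathcal{P}^-_{\epsilon,\nu}} x_P \ge \delta\, n/2$. This time, we define the volume $T$ and
average latency $C$ of paths which are potential origins of agents migrating towards $\mathcal{P}^-_{\epsilon,\nu}$:

$$ T = \sum_{Q:\, \ell_Q \ge (1-\epsilon) L_{av}} \frac{x_Q}{n} \qquad \text{and} \qquad C = \frac{1}{T} \sum_{Q:\, \ell_Q \ge (1-\epsilon) L_{av}} \frac{x_Q}{n}\, \ell_Q . $$

This time,

$$ L_{av} \le T \cdot C + (1 - T) \cdot (1 - \epsilon)\, L_{av} , $$

implying

$$ T \ge \frac{\epsilon\, L_{av}}{C - (1 - \epsilon)\, L_{av}} . \qquad (8) $$

Similarly as in Case 1 we now give a lower bound on the contribution to the virtual potential gain
caused by agents with latency at least $(1 - \epsilon)\, L_{av}$ sampling agents in $\mathcal{P}^-_{\epsilon,\nu}$.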



E 



£Vpq 

P,Q 



d 



E x q £ q E V 



xp ( Iq - £ P 



Q-.e Q >(l-e)L d , 



pev: 
We rearrange the sum and apply Jensen's inequality (Fact 19) to obtain



E 



P,Q 



< 



< 



d E E 



X Q @Q (tQ-tp 



A y 
d ^ 



Xp 



E 



Pev- 



iQ:< Q >(l-e)L» 



^Q:£ Q >(l-e)L av „ 



A y 
d ^ 



Xp 



\Q:£ Q >(l- e )L„ / 



= -3 E 



Pevr v 



CT 



< - 



JT.(C-(1- £ )L„)) 2 -^. J2 
Finally, using Equation (8) and $C\, T \le L_{av}$,



x P 



Per; 



E^Q 

P,Q 



< 



< 



< 



x P 



Pevi 



-Sn 



6e 2 <S> 
d 



In both cases, the potential decreases by at least a factor of $(1 - \Omega(\epsilon^2 \delta/d))$ in expectation, which, by
Lemma 20, implies that the expected time to reach a state with $\Phi \le \Phi^*$ is at most the time stated
in the theorem. □

From Theorem 7 we can immediately derive the next corollary. 

Corollary 8. Consider a symmetric network congestion game with polynomial latency functions of maximum
degree $d$ and with minimum and maximum coefficients $a_{\min}$ and $a_{\max}$, respectively. If all players use the
IMITATION PROTOCOL, then the expected convergence time of imitation dynamics to a $(\delta,\epsilon,\nu)$-equilibrium
is upper bounded by

G "tt ■ log [n m — 

Let us remark that $(\delta,\epsilon,\nu)$-equilibria are transient, i.e., they can be left again once they are reached,
for example, if the average latency decreases or if agents migrate towards low-latency paths. However, our
proofs actually do not only bound the time until a $(\delta,\epsilon,\nu)$-equilibrium is reached for the first time, but rather
the expected total number of rounds in which the system is not at a $(\delta,\epsilon,\nu)$-equilibrium.

Note that in the definition of $(\delta,\epsilon,\nu)$-equilibria we require the majority of agents to deviate by no more
than a small amount from $L_{av}^+$. This is because the expected latency of a path sampled by an agent is $L_{av}$, but
the latency of the destination path becomes larger if the agent migrates. We use $L_{av}^+$ as an upper bound in our
proof, although we could use a slightly smaller quantity in cases where the origin $Q$ and the destination $P$
intersect, namely $\ell_P(x + 1_P - 1_Q)$. Using an average over $P$ and $Q$ of this quantity rather than $L_{av}^+$ would
result in a slightly stronger definition of $(\delta,\epsilon,\nu)$-equilibria. However, we go with the definition as presented
above for the sake of clarity of presentation.

Let us conclude this section by showing that there are fundamental limitations to fast convergence. One
could hope to show fast convergence towards a state in which all agents are approximately satisfied, i.e.,
$\delta = 0$. However, any protocol that proceeds by sampling either a strategy or an agent and then possibly
migrates takes at least expected time $\Omega(n)$ to reach a state in which all agents sustain a latency that is within
a constant factor of $L_{av}^+$. To see this, consider an instance with $n = 2m$ agents and identical linear latency
functions. Now, let $x_1 = 3$, $x_2 = 1$ and $x_i = 2$ for $3 \le i \le m$. Then, the probability that one of the players
currently using resource 1 samples resource 2 is at most $O(1/m) = O(1/n)$. Since this is the only possible
improvement step, this yields the desired bound.
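
The sampling probability in this example can be computed exactly. The sketch below assumes that each of the three players on resource 1 independently samples one of the $n$ players uniformly at random (an assumption about the sampling step; the function name is illustrative) and returns the probability that at least one of them hits the single player on resource 2.

```python
from fractions import Fraction


def prob_improvement_sampled(m):
    """Instance with n = 2m agents: loads x_1 = 3, x_2 = 1, x_i = 2 otherwise."""
    n = 2 * m
    # A single sample misses the lone player on resource 2 with probability (n-1)/n.
    all_miss = Fraction(n - 1, n) ** 3
    return 1 - all_miss
```

By the union bound this probability is at most $3/n$, so the expected number of rounds before the only improving move is even sampled grows linearly in $n$.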

5 Imitation Dynamics in Singleton Games 

In this section, we improve on our previous results and consider imitation dynamics in the special case of
singleton congestion games. A major drawback of the IMITATION PROTOCOL is that players who rely on
this protocol cannot explore the complete set of edges if the dynamics start in a state in which some edges are
unused. Even worse, the event that an edge becomes unused in later states, although it has been used in the
initial state, is not impossible. It is clear, however, that when starting from a random initial distribution of
players among the edges, emptying an edge becomes increasingly unlikely as the number
of players increases.

Subsequently, we formalize this statement in the following sense. Consider a family of singleton congestion
games over the same set of edges with latency functions without offsets. Then, the probability that
an edge becomes unused is exponentially small in the number of players. To this end, consider a vector of
continuous latency functions $\mathcal{L} = (\ell_e)_{e \in [m]}$ with $\ell_e : [0, 1] \to \mathbb{R}_{\ge 0}$. To use these functions for games with a
finite number of players, we have to normalize them appropriately. For any such function $\ell \in \mathcal{L}$, let $\ell^n$ with
$\ell^n(x) = \ell(x/n)$ denote the respective scaled function. We may think of this as having $n$ agents with weight
$1/n$ each. Note that this transformation leaves the elasticity unchanged, whereas the step size $\nu$ decreases as
$n$ increases. For a vector of latency functions $\mathcal{L} = (\ell_e)_{e \in [m]}$, let $\mathcal{L}^n = (\ell_e^n)_{e \in [m]}$.
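
A quick numerical check of the scaling remark may be helpful. The sketch assumes the usual pointwise elasticity $x \cdot \ell'(x)/\ell(x)$, approximated by a central difference; the helper names and the sample latency function are illustrative only.

```python
def scaled(latency, n):
    """Return l^n with l^n(x) = l(x / n)."""
    return lambda x: latency(x / n)


def elasticity(latency, x, h=1e-6):
    """Numerical elasticity x * l'(x) / l(x) using a central difference."""
    derivative = (latency(x + h) - latency(x - h)) / (2 * h)
    return x * derivative / latency(x)
```

The elasticity of $\ell^n$ at load $x$ equals that of $\ell$ at $x/n$, while the latency increase caused by one additional agent at the same relative load shrinks as $n$ grows.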

Theorem 9. Fix a vector of latency functions $\mathcal{L}$ with $\ell_e(0) = 0$ for all $e \in [m]$. For the singleton congestion
game over $\mathcal{L}^n$ with $n$ players, the probability that the IMITATION PROTOCOL with random initialization
generates a state with $x_e = 0$ for some $e \in [m]$ within $\mathrm{poly}(n)$ rounds is bounded by $2^{-\Omega(n)}$.

Proof. Let $d$ denote an upper bound on the elasticity of the functions in $\mathcal{L}$, and let $\mathrm{opt}_{\mathcal{L}} = \min_y \{L_{av}(y)\}$
where the minimum is taken over all $y \in \{y' \in \mathbb{R}^m_{\ge 0} \mid \sum_e y'_e = 1\}$. In other words, $\mathrm{opt}_{\mathcal{L}}$ corresponds
to the minimum average latency achievable in a fractional solution. For any $e \in [m]$, by continuity and
monotonicity, there exists a $y_e > 0$ such that $\ell_e(y_e) < \mathrm{opt}_{\mathcal{L}}/4^d$ and $y_e < 1/m$.

Consider the congestion game with $n$ players and fix an arbitrary edge $e \in [m]$. In the following, we
upper bound the probability that the congestion on edge $e$ falls below $n y_e/2$. First, consider the random
initialization in which each resource receives an expected number of $n/m$ agents. The probability that
$x_e < n y_e/2 < n/(2m)$ is at most $2^{-\Omega(n y_e)}$. Now, consider any assignment $x$ with $x_e \ge n y_e/2$ for all $e \in [m]$.
There are two cases.

Case 1: $x_e > y_e n$. Since in expectation, our policy removes at most a $\lambda/d$ fraction of the agents from edge
$e$, the expected load in the subsequent round is at least $(1 - \lambda/d)\, x_e$. Since for sufficiently small $\lambda$ it
holds that $1 - \lambda/d > 3/4$, we can apply a Chernoff bound (Fact 16) in order to obtain an upper bound
of $2^{-\Omega(x_e)}$ for the probability that the congestion on $e$ decreases to below $x_e/2 \ge y_e n/2$.

Case 2: $y_e n/2 \le x_e \le y_e n$. Hence, $\ell_e^n(x_e) \le \mathrm{opt}_{\mathcal{L}}/4^d$. In the following, let $n^-$ denote the number of
agents on edges $r$ with $\ell_r^n(x_r + 1) < \ell_e^n(x_e)$, and let $n^+$ denote the number of players utilizing edges
with latency above $\mathrm{opt}_{\mathcal{L}}$. There are two subcases:

Case 2a: $n^- = 0$. Then, the probability that an agent leaves edge $e$ is 0.

Case 2b: $n^- \ge 1$. We first show that $n^+ \ge 4 \max\{n^-, x_e\}$. For the sake of contradiction, assume
that $n^+ < 4 n^-$. Now, consider an assignment where all of these players are shifted to edges
$r$ with latency $\ell_r^n(x_r) \le \ell_e^n(x_e) \le \mathrm{opt}_{\mathcal{L}}/4^d$, where edge $r$ receives $n^+ \cdot x_r/n^-$ (fractional)
players. In this assignment, the congestion on all edges is increased by no more than a factor of
$n^+/n^- < 4$. Hence, due to the limited elasticity, this increases the latency by strictly less than
a factor of $4^d$. Then, all edges have a latency of less than $\mathrm{opt}_{\mathcal{L}}/4^d \cdot 4^d = \mathrm{opt}_{\mathcal{L}}$ and some have
latency strictly less than $\mathrm{opt}_{\mathcal{L}}$, a contradiction. The same argument also holds if we consider only
resource $e$ rather than all resources $r$ considered above. Hence, also $n^+ \ge 4 x_e$.
Now, consider the number of players leaving edge $e$. Clearly,

$$ \mathrm{E}[\Delta X_e^-] \le x_e \cdot \frac{\lambda}{d} \cdot \sum_{r:\, \ell_r^n(x_r+1) < \ell_e^n(x_e)} \frac{x_r}{n} = x_e \cdot \frac{\lambda}{d} \cdot \frac{n^-}{n} . $$

All players with current latency at least $\mathrm{opt}_{\mathcal{L}}$ can migrate to resource $e$ since the anticipated
latency gain is larger than $\nu$. Hence, the number of players migrating towards $e$ is at least

$$
\begin{aligned}
\mathrm{E}[\Delta X_e^+] &\ge \sum_{r:\, \ell_r^n(x_r) \ge \mathrm{opt}_{\mathcal{L}}} x_r \cdot \frac{\lambda\, x_e \cdot (\ell_r^n(x_r) - \ell_e^n(x_e + 1))}{n\, d\, \ell_r^n(x_r)} \\
&\ge \frac{\lambda\, x_e}{n\, d} \cdot \sum_{r:\, \ell_r^n(x_r) \ge \mathrm{opt}_{\mathcal{L}}} x_r \cdot \frac{\ell_r^n(x_r) - 2^d \cdot \ell_e^n(x_e)}{\ell_r^n(x_r)} \\
&\ge \frac{\lambda\, x_e}{n\, d} \cdot \left(1 - \frac{1}{2^d}\right) \cdot n^+ \\
&\ge 2 \cdot x_e \cdot \frac{\lambda}{d\, n} \cdot \max\{n^-, x_e\} .
\end{aligned}
$$

The third inequality holds since $\ell_r^n \ge \mathrm{opt}_{\mathcal{L}}$ and $\ell_e^n \le \mathrm{opt}_{\mathcal{L}}/4^d$ and the last inequality holds since
$d \ge 1$. For any $T \ge 0$ it holds that

$$
\begin{aligned}
P[\Delta X_e \ge 0] &\ge P[(\Delta X_e^+ \ge T) \wedge (\Delta X_e^- \le T)] \\
&\ge (1 - P[\Delta X_e^+ < T]) \cdot (1 - P[\Delta X_e^- > T]) .
\end{aligned}
$$

Due to our bounds on $\mathrm{E}[\Delta X_e^+]$ and $\mathrm{E}[\Delta X_e^-]$ we can apply a Chernoff bound (Fact 16)
on these probabilities. We set $T = 1.5\, \lambda \max\{x_e, n^-\}\, x_e/(d\, n)$ which is an upper bound on
$\mathrm{E}[\Delta X_e^-]$ and a lower bound on $\mathrm{E}[\Delta X_e^+]$, so

$$ P[\Delta X_e^+ < T] \le 2^{-\Omega(T)} \le 2^{-\Omega(\lambda x_e^2/(d n))} \quad \text{and} \quad P[\Delta X_e^- > T] \le 2^{-\Omega(T)} \le 2^{-\Omega(\lambda x_e^2/(d n))} . $$

Altogether,

$$ P[\Delta X_e \ge 0] \ge \left(1 - 2^{-\Omega(\lambda x_e^2/(d n))}\right)^2 = 1 - 2^{-\Omega(\lambda x_e^2/(d n))} . $$

Finally, since $x_e \ge n y_e/2$, $P[\Delta X_e < 0] \le 2^{-\Omega(\lambda n y_e^2/d)} = 2^{-\Omega(n)}$.

In all cases, the probability that the edge becomes unused is bounded by $2^{-\Omega(n)}$. Hence, the same
holds also for $m = \mathrm{poly}(n)$ edges and $\mathrm{poly}(n)$ rounds. □

The proof shows not only that edges do not become empty with high probability, but also that the
congestion does not fall below any constant congestion value. In particular, for the constant $d$ this implies
that with high probability the dynamics never reach Case 2b of the proof of Lemma 2. This is the only place
where our analysis relies on the parameter $\nu$. Hence, for a large number of players we can remove it from the
protocol and the dynamics converge to an exact Nash equilibrium.

5.1 The Price of Imitation



In the preceding section we have seen that it is unlikely that resources become unused when the granularity
of an agent decreases. If the instance, i.e., the latency functions and the number of users, is fixed, it is an
interesting question how much the performance can suffer from the fact that the IMITATION PROTOCOL is
not innovative. We measure this degradation of performance by introducing the Price of Imitation which
is defined as the ratio between the expected social cost of the state to which the IMITATION PROTOCOL
converges, denoted $I_\Gamma$, and the optimum social cost. The expectation is taken over the random choices of the
IMITATION PROTOCOL, including random initialization.

We answer this question here for the case of linear latency functions of the form $\ell_e(x) = a_e\, x$. Then,
$d = 1$ is an upper bound on the elasticity and $\nu = a_{\max} = \max_{e \in E}\{a_e\}$. Choosing the average latency
$SC(x) = \sum_{e \in E} (x_e/n) \cdot \ell_e(x_e)$ as the social cost measure, we show that the Price of Imitation is bounded by
a constant. It is, however, obvious that the same also holds if we consider the makespan, i.e., the maximum
latency, as social cost function.

The performance of the dynamics can be artificially degraded by introducing an extremely slow edge.
Thus, $a_{\max}$ can be chosen extremely large such that any state is imitation-stable. However, such a resource
can be removed from the instance without harming the optimal solution at all since it would not be used
anyhow. We will call such resources useless and make this notion precise below.

Let us first define some quantities used in the proof. For a set of resources $M$, let $A_M = \sum_{e \in M} 1/a_e$ and
let $A_\Gamma = A_{[m]}$. For $M \subseteq [m]$ let $\Gamma \setminus M$ denote the instance obtained from $\Gamma$ by removing all resources in
$M$. In the proof, we do not compare the outcome of the IMITATION PROTOCOL to the optimum solution, but
rather to a lower bound, namely the optimal fractional solution. The optimal fractional solution $\tilde{x}$ can be
computed as $\tilde{x}_e = n/(A_\Gamma\, a_e)$. For this solution, the latency of all resources is $a_e \cdot \tilde{x}_e = n/A_\Gamma$. A resource
is useless if $\tilde{x}_e < 1$. In the following, we assume that there are no useless resources. Then, we can show that
the social cost at an imitation-stable state in which all resources are used does not differ by more than a small
constant factor from the optimal social cost (Lemma 11) and that the Price of Imitation is small. In fact, whereas
$\tilde{x}_e \ge 1$ is required for Lemma 11, we here need a slightly stronger assumption, namely that $\tilde{x}_e = \Omega(\log n)$.
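
The quantities just defined are easy to compute exactly. The following sketch evaluates $A_\Gamma$, the optimal fractional solution $n/(A_\Gamma a_e)$ and the set of useless resources for linear latencies $a_e \cdot x$; the function names are illustrative and the slopes are assumed to be given as integers or fractions.

```python
from fractions import Fraction


def optimal_fractional(slopes, n):
    """Return (A_Gamma, x_opt) with x_opt[e] = n / (A_Gamma * a_e)."""
    a_gamma = sum(1 / Fraction(a) for a in slopes)  # A_Gamma = sum_e 1/a_e
    x_opt = [Fraction(n) / (a_gamma * a) for a in slopes]
    return a_gamma, x_opt


def useless_resources(slopes, n):
    """Indices of resources whose optimal fractional load is below 1."""
    _, x_opt = optimal_fractional(slopes, n)
    return [e for e, x in enumerate(x_opt) if x < 1]
```

For slopes $(1, 2, 3)$ and $n = 12$ this gives $A_\Gamma = 11/6$, loads summing to 12, and the common latency $72/11 = n/A_\Gamma$ on every resource.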

Theorem 10. Assume that for the optimal fractional solution, $\tilde{x}_e = \Omega(\log n)$ large enough. The Price of
Imitation is at most $(3 + o(1))$. In particular, for $\delta > 0$, and any $n \ge n_0(\delta)$ for a large enough value $n_0(\delta)$
(which is independent of the instance),

We start by proving two lemmas. 
Lemma 11. Let $x$ be a state in which no agent can gain more than $a_{\max}$. Then,

$$ \frac{n}{A_\Gamma} \le SC(x) \le 3 \cdot \frac{n}{A_\Gamma} . $$



Proof. The lower bound has been proven above since $n/A_\Gamma$ is the social cost of an optimal fractional solution.
Also note that, since there are no useless resources, $\tilde{x}_e \ge 1$ and hence $n/A_\Gamma \ge a_{\max}$.

For the upper bound, consider a state $x$ in which no agent can gain more than $a_{\max}$. For the sake of
contradiction assume that there exists a resource $e \in [m]$ with $\ell_e(x_e) > 3\, n/A_\Gamma$. Since $x \ne \tilde{x}$ there exists a
resource $f \ne e$ with $x_f < \tilde{x}_f$. In particular, $\ell_f(x_f + 1) \le n/A_\Gamma + a_{\max} \le 2n/A_\Gamma < \ell_e(x_e) - a_{\max}$. The
last inequality holds due to our assumption on $\ell_e(x_e)$ and since $n/A_\Gamma \ge a_{\max}$. Hence, any agent on resource
$e$ can improve by more than $a_{\max}$ by migrating to $f$, a contradiction. □
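
The bound of Lemma 11 can be checked on a small instance. The sketch below is not the IMITATION PROTOCOL: it simply moves one agent at a time while some agent can gain more than $a_{\max}$ (to any resource, used or not), which yields a state satisfying the hypothesis of the lemma. Integer slopes and the function names are assumptions made for the example.

```python
from fractions import Fraction


def social_cost(loads, slopes):
    # SC(x) = sum_e (x_e / n) * a_e * x_e
    n = sum(loads)
    return Fraction(sum(a * x * x for a, x in zip(slopes, loads)), n)


def settle(loads, slopes):
    """Move single agents until no agent can gain more than a_max."""
    loads = list(loads)
    a_max = max(slopes)
    while True:
        best = None
        for e, x in enumerate(loads):
            if x == 0:
                continue
            for f in range(len(loads)):
                if f == e:
                    continue
                gain = slopes[e] * x - slopes[f] * (loads[f] + 1)
                if gain > a_max and (best is None or gain > best[0]):
                    best = (gain, e, f)
        if best is None:
            return loads
        _, e, f = best
        loads[e] -= 1
        loads[f] += 1
```

Starting with all 12 agents on the first of three resources with slopes $(1, 2, 3)$, the procedure stops in a state whose social cost lies between $n/A_\Gamma = 72/11$ and $3n/A_\Gamma = 216/11$.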

Lemma 12. The IMITATION PROTOCOL converges towards an imitation-stable state in time $O(n^4 \log n)$.

Proof. Consider a state $x(t)$ in which there is at least one agent who can make an improvement of $a_{\max}$.
Since its current latency is at most $n \cdot a_{\max}$ and the probability to sample the correct resource is at least $1/n$,
the probability to do so is at least $\lambda \cdot (1/n) \cdot (a_{\max}/(n\, a_{\max})) = \lambda/n^2$ and the virtual potential gain of such a
step is $a_{\max} \ge \Phi/n^2$. Hence, the expected virtual potential gain in state $x(t)$ is at least $\lambda\, \Phi(x(t))/n^4$. Hence,
by Lemma 2,

$$ \mathrm{E}[\Phi(x(t+1))] \le \Phi(x(t)) \cdot \left(1 - \frac{\lambda}{2\, n^4}\right) . $$

Note that $\Phi^* \ge n\, a_{\min}$ and $a_{\max} \le n\, a_{\min}$ by the assumption that no resource is useless. Also, $\Phi(x(0)) \le
n^2 a_{\max}$. Now, the lemma follows by an application of Lemma 21 in the appendix. □

Based upon the proof of Theorem 9 we can now bound the probability that a resource becomes empty for 
the case of linear latency functions more specifically. 

Lemma 13. The probability that all resources of the subset $M \subseteq [m]$ become empty in one round simultaneously
is bounded from above by

$$ \prod_{e \in M} 2^{-\Omega(n/(A_\Gamma a_e))} . $$

Proof. Recall the bounds on the probability that a resource $e \in [m]$ becomes empty in the proof of Theorem 9.
Since we now consider linear latency functions, we may explicitly compute the value of $y_e = 1/(A_\Gamma\, a_e)$.
Recall the two cases and the failure probability in the initialization:

Initialization: Here, the error probability was at most $2^{-\Omega(n y_e)} = 2^{-\Omega(n/(A_\Gamma a_e))}$.

Case 1: $x_e > y_e n$. Here, the error probability was at most $2^{-\Omega(x_e)} = 2^{-\Omega(n/(A_\Gamma a_e))}$.

Case 2: $y_e n/2 \le x_e \le y_e n$. Here, the error probability was at most $2^{-\Omega(x_e^2/n)} = 2^{-\Omega(n/(A_\Gamma a_e)^2)}$.

In all cases, the probability that resource $e$ becomes empty is at most $2^{-\Omega(n/(A_\Gamma a_e))}$.

Furthermore, consider resources $e$ and $e'$ and let $E$ and $E'$ denote the events that $e$ and $e'$ become empty,
respectively. It holds that $P[E' \mid E] \le P[E']$. Therefore, $P[E \cap E'] = P[E] \cdot P[E' \mid E] \le P[E] \cdot P[E']$.
Extending this argument to several resources yields the statement of the lemma. □
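
The extension to several resources can be spelled out with the chain rule. The sketch below assumes that the conditional bound also holds when conditioning on several other resources being empty, which is the step the proof appeals to. Here $E_e$ denotes the event that resource $e$ becomes empty, $M = \{e_1, \dots, e_k\}$, and the last step uses the per-resource bound from the case analysis above.

```latex
P\Bigl[\,\bigcap_{i=1}^{k} E_{e_i}\Bigr]
  \;=\; \prod_{i=1}^{k} P\bigl[E_{e_i} \,\big|\, E_{e_1} \cap \dots \cap E_{e_{i-1}}\bigr]
  \;\le\; \prod_{i=1}^{k} P\bigl[E_{e_i}\bigr]
  \;\le\; \prod_{e \in M} 2^{-\Omega\left(n/(A_\Gamma a_e)\right)}
```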

Using the above two lemmas, we can now prove the main theorem of this section. 

Proof of Theorem 10. The proof is by induction on the number of resources $m$. Clearly, the statement holds
for $m = 1$, in which case there is only one assignment. In the following we divide the sequence of states
generated by the IMITATION PROTOCOL into phases consisting of several rounds. A phase is terminated
by one of the following events, whichever happens first:

1. A subset of resources $M$ becomes empty.

2. The IMITATION PROTOCOL reaches an imitation-stable state.

3. The protocol enters round $\Theta(n^5 \log n)$.

If a phase ends because Event 1 occurs, we start a new phase for the instance $\Gamma \setminus M$. If it ends because of
Event 3, we start a new phase for the original instance.

The probability for Event 1 is bounded by Lemma 13. Note that the probability is also bounded for up
to $\mathrm{poly}(n)$ many rounds. If a phase ends with Event 2 we have $I_\Gamma \le 3\,n/A_\Gamma$ (Lemma 11). We bound the
probability of this event by 1, which is trivially true. Event 3 happens with probability at most $O(1/n)$.
This can be shown using Lemma 12 and Markov's inequality. Note that the expected social cost is still at
most $I_\Gamma$. Summing up over all three events, we obtain the following recurrence:

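$$I_\Gamma \le \sum_{M \subset [m]} \prod_{e \in M} 2^{-\Omega\left(n/(A_\Gamma a_e)\right)} \cdot I_{\Gamma \setminus M} + \frac{3n}{A_\Gamma} + O\!\left(\frac{1}{n}\right) \cdot I_\Gamma ,$$

implying

$$I_\Gamma \cdot \left(1 - O\!\left(\frac{1}{n}\right)\right) \le \frac{3n}{A_\Gamma} + \sum_{M \subset [m]} \prod_{e \in M} 2^{-\Omega\left(n/(A_\Gamma a_e)\right)} \cdot I_{\Gamma \setminus M} .$$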

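Substituting the induction hypothesis for $I_{\Gamma \setminus M}$ and introducing a constant $c$ for the constant in the $\Omega()$,

$$I_\Gamma \cdot \left(1 - O\!\left(\frac{1}{n}\right)\right)
  \le \frac{3n}{A_\Gamma} + 4 \sum_{M \subset [m]} \prod_{e \in M} 2^{-c\,n/(A_\Gamma a_e)} \cdot \frac{n}{A_{\Gamma \setminus M}}
  = \frac{3n}{A_\Gamma} + 4n \sum_{M \subset [m]} \frac{2^{-c\,n\,A_M/A_\Gamma}}{A_{\Gamma \setminus M}} .$$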



Now, by our assumption that for all $e \in M$, $x_e = n/(A_\Gamma \cdot a_e) \ge \Omega(\log n)$, we know that for all $e$, $1/a_e \ge
c' A_\Gamma \cdot \log n/n$ for a constant $c'$ which we may choose appropriately. In particular, $A_M \ge |M|\,c' A_\Gamma \cdot \log n/n$
and $A_{\Gamma \setminus M} \ge c' A_\Gamma \cdot \log n/n$. Altogether,



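$$\begin{aligned}
I_\Gamma \cdot \left(1 - O\!\left(\frac{1}{n}\right)\right)
 &\le \frac{n}{A_\Gamma} \left(3 + 4 \sum_{M \subset [m]} 2^{-c c' |M| \log n} \cdot \frac{n}{c' \log n}\right) \\
 &\le \frac{n}{A_\Gamma} \left(3 + 4 \sum_{k=1}^{m-1} \binom{m}{k}\, 2^{-c c' k \log n} \cdot \frac{n}{c' \log n}\right) \\
 &\le \frac{n}{A_\Gamma} \left(3 + 4 \sum_{k=1}^{m-1} n^k \cdot 2^{-c c' k \log n} \cdot \frac{n}{c' \log n}\right) \\
 &\le \frac{n}{A_\Gamma} \left(3 + 4 \sum_{k=1}^{m-1} 2^{-(c c' - 1) k \log n} \cdot \frac{n}{c' \log n}\right) \\
 &\le \frac{n}{A_\Gamma} \left(3 + 4 \sum_{k=1}^{m-1} n^{-(c c' - 1) k + 1}\right) \\
 &\le \frac{n}{A_\Gamma} \cdot (3 + o(1)) ,
\end{aligned}$$

since the last sum is bounded by $o(1)$ for a sufficiently large constant $c'$. This implies our claim. □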



6 Exploring New Strategies 

In Section 3, we have seen that, in the long run, the dynamics resulting from the IMITATION PROTOCOL
converge to an imitation-stable state in pseudopolynomial time. The IMITATION PROTOCOL and the concept
of an imitation-stable state have the drawback that the dynamics can stabilize in a quite disadvantageous 
situation, e.g. when all players play the same expensive strategy. This is due to the fact that the strategy space 
is essentially restricted to the current strategy choices of the agents. Strategies that might be attractive and 
offer a large latency gain are "lost" once no player uses them anymore. 

A stronger result would be convergence towards a Nash equilibrium. In the literature, several other protocols
are discussed. For all of the protocols we are aware of, the probability to migrate from one strategy to
another depends in some continuous, non-decreasing fashion on the anticipated latency gain, and it becomes
zero for zero gain. Hence, in a setting with arbitrary latency functions, which we consider here, there always
exist simple instances and states that are not at equilibrium and in which only one improvement step is possible,
and this step has an arbitrarily small latency gain. Hence, it can take pseudopolynomially long until an exact Nash
equilibrium is reached. Still, it might be desirable to design a protocol which reaches a Nash equilibrium in
the long run. There are several ways to achieve this goal. We will discuss three of them here.

First, Theorem 9 states the following for a particular class of singleton congestion games: with an increasing
number of players it becomes increasingly unlikely that useful strategies are lost. This allows us to omit the
parameter $\nu$ from the protocol. If no strategies are lost for a long period of time, the dynamics will converge
towards an exact Nash equilibrium.

Second, we may add an additional "virtual agent" to every strategy, such that the probability to sample a
strategy never becomes zero. This has two implications for our analysis. On the one hand, there is a certain
base load on all resources, denoted by $x_e^0$. We then need to have an upper bound on the elasticity of $\ell_e(x - x_e^0)$,
which may be larger than the elasticity of $\ell_e(x)$ itself. Furthermore, we have to add $|\mathcal{P}|$ virtual agents, which
leaves the analysis of the time of convergence unchanged only if $n = \Omega(|\mathcal{P}|)$.

As a third alternative, we can add an exploration component to the protocol. With a probability of $1/2$, the
agents can sample another path uniformly at random rather than another agent. In this case, however, the
elasticity $d$ cannot be used as a damping factor anymore, since the expected increase of congestion may be much
larger than the current load. Rather, we have to reduce the migration probability by a factor
$\min\{1, |\mathcal{P}|\,\ell_{\min}/(\beta n)\}$, where $\beta$ is an upper bound on the maximum slope and
$\ell_{\min} = \min_{e \in E} \ell_e(1)$ is the minimum latency of an empty resource.

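Protocol 2 EXPLORATION PROTOCOL, repeatedly executed by all players in parallel.
Let $P$ denote the path of the player in state $x$.
Sample another path $Q \in \mathcal{P}$ uniformly at random.
if $\ell_P(x) > \ell_Q(x + 1_Q - 1_P)$ then
    with probability
    $$\mu_{PQ} = \min\left\{1,\; \lambda \cdot \frac{|\mathcal{P}|\,\ell_{\min}}{\beta n} \cdot \frac{\ell_P(x) - \ell_Q(x + 1_Q - 1_P)}{\ell_P(x)}\right\}$$
    migrate from path $P$ to path $Q$.
end if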
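
One concurrent round of such an exploration step can be sketched in Python. This is a minimal illustration, not the authors' implementation. It assumes a singleton game in which resource `e` has the linear latency `slopes[e] * x`, and it assumes the migration probability `min(1, lam * (m * l_min / (beta * n)) * gain / current_latency)`. The function names `exploration_round` and `potential` are hypothetical, and `potential` uses the standard Rosenthal potential.

```python
import random


def exploration_round(loads, slopes, lam, rng):
    """Run one concurrent round and return the new load vector.

    Assumptions (for illustration only): singleton game with n >= 1 agents,
    resource e has latency slopes[e] * x, and an agent on p that samples q
    migrates with probability
    min(1, lam * (m * l_min / (beta * n)) * gain / current_latency).
    """
    m, n = len(loads), sum(loads)
    beta = max(slopes)    # upper bound on the slope of any latency function
    l_min = min(slopes)   # smallest latency of a resource used by one agent
    scale = lam * m * l_min / (beta * n)
    delta = [0] * m
    for p in range(m):
        lat_p = slopes[p] * loads[p]
        for _ in range(loads[p]):                # every agent on p decides on the old state
            q = rng.randrange(m)                 # uniform sample from the strategy space
            lat_q = slopes[q] * (loads[q] + 1)   # latency of q after joining it
            if q != p and lat_p > lat_q:
                if rng.random() < min(1.0, scale * (lat_p - lat_q) / lat_p):
                    delta[p] -= 1
                    delta[q] += 1
    return [x + d for x, d in zip(loads, delta)]


def potential(loads, slopes):
    """Rosenthal potential: sum over resources of a_e * (1 + 2 + ... + x_e)."""
    return sum(a * x * (x + 1) / 2 for a, x in zip(slopes, loads))
```

Calling `exploration_round` repeatedly with a seeded `random.Random` gives a reproducible trajectory whose potential can be tracked with `potential`. A state in which no agent can improve by switching is left unchanged, because the improvement test fails for every sampled resource.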



Lemma 14. Let $x$ denote a state and let $\Delta x$ denote a random migration vector generated by the EXPLORATION
PROTOCOL. Then,

$$E[\Delta\Phi(x, \Delta x)] \le \frac{1}{2} \sum_{P,Q \in \mathcal{P}} E[V_{PQ}(x, \Delta x)] .$$

Proof. Recall that Lemma 1 states the following for every state $x$ and every migration vector $\Delta x$:

$$\Delta\Phi(x, \Delta x) \le \sum_{P,Q \in \mathcal{P}} V_{PQ}(x, \Delta x) + \sum_{e \in E} F_e(x, \Delta x) .$$

Now, in order to prove Lemma 14, we apply the same approach as in the proof of Lemma 2. Hence, it remains
to adapt the upper bound on $E[\Delta\ell_e(\Delta x_e)]$ to the EXPLORATION PROTOCOL. Note that this is quite simple,
since due to the linearity of expectation,



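$$E[\Delta\ell_e(\Delta x_e)] \le \beta\, E[\Delta x_e] \le \beta \cdot n \cdot \frac{1}{|\mathcal{P}|} \cdot \lambda \cdot \frac{|\mathcal{P}|\,\ell_{\min}}{\beta n} \cdot \frac{\ell_P - \ell_Q}{\ell_P} ,$$

where we have substituted the migration probability of the protocol and the fact that there are at most $n$ agents
that may sample a path containing $e$. This proves Equation (2) if $\lambda$ is chosen small enough. With opposite
signs, the same argument holds if $e \in P$. □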

Since we have omitted the parameter $\nu$ from the protocol, we now need a lower bound on the minimum
improvement that is possible when the system is not yet at an imitation-stable state in order to give an upper
bound on the convergence time. Formally, let

$$\kappa = \min_{x} \; \min_{\substack{P,Q \in \mathcal{P} \\ \ell_P(x) > \ell_Q(x + 1_Q - 1_P)}} \{\ell_P(x) - \ell_Q(x + 1_Q - 1_P)\} .$$
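
For very small instances, the minimum improvement can be computed by brute force. The sketch below is an illustration under stated assumptions: it restricts attention to singleton games with linear latencies `slopes[e] * x` (the definition itself covers general strategy spaces), it reads the outer minimum as ranging over all states, and the function name `min_improvement` is hypothetical.

```python
from itertools import product


def min_improvement(slopes, n):
    """Smallest positive latency gain of a unilateral move, over all states.

    Assumes a singleton game with n agents in which resource e has latency
    slopes[e] * x. Returns None if no state admits an improving move.
    """
    m = len(slopes)
    best = None
    for loads in product(range(n + 1), repeat=m):
        if sum(loads) != n:
            continue                                # not a valid assignment of n agents
        for p in range(m):
            if loads[p] == 0:
                continue                            # nobody on p who could move
            lat_p = slopes[p] * loads[p]
            for q in range(m):
                if q == p:
                    continue
                lat_q = slopes[q] * (loads[q] + 1)  # latency of q after the move
                if lat_p > lat_q:
                    gain = lat_p - lat_q
                    if best is None or gain < best:
                        best = gain
    return best
```

The enumeration visits `(n + 1) ** m` load vectors, so it is only meant for toy instances. Using integer slopes keeps the gains exact.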

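Theorem 15. Consider a symmetric network congestion game in which all players use the EXPLORATION
PROTOCOL. Let $x$ denote the initial state of the dynamics. Then the dynamics converge to a Nash equilibrium
in expected time

$$O\!\left(\frac{\Phi(x)\,\beta\,n\,\ell_{\max}}{\lambda\,\ell_{\min}\,\kappa^2}\right) .$$

Proof. In every state which is not a Nash equilibrium there exists an agent currently utilizing path $P \in \mathcal{P}$
and a path $Q \in \mathcal{P}$ such that $\ell_Q \le \ell_P - \kappa$. Hence, the expected virtual potential gain is at least

$$\frac{1}{|\mathcal{P}|} \cdot \lambda \cdot \frac{|\mathcal{P}|\,\ell_{\min}}{\beta n} \cdot \frac{\kappa}{\ell_{\max}} \cdot \kappa = \frac{\lambda\,\ell_{\min}\,\kappa^2}{\beta\,n\,\ell_{\max}} ,$$

and the true potential gain is at least half of this. Again, Lemma 20 yields the expected time until the potential
decreases from at most $\Phi$ to $\Phi^* \ge 0$. □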

It is obvious that an analogue of Lemmas 2 and 14 also holds for any protocol that is a combination of
the IMITATION PROTOCOL and the EXPLORATION PROTOCOL, e.g., a protocol in which, in every round,
every agent executes one or the other with probability one half. Then, in order to bound the value of
$E[\Delta\ell_e(\Delta x_e)]$, we must make a case distinction based on whether proportional or uniform sampling
dominates the probability that other agents migrate towards resource $e$. Such a protocol combines the
advantages of the IMITATION PROTOCOL and the EXPLORATION PROTOCOL: In the long run, it converges to a
Nash equilibrium, and it reaches an approximate equilibrium as quickly as stated by Theorem 7 (up to a factor
of 2).



7 Conclusion 

We have proposed and analyzed a natural protocol based on imitating profitable strategies for distributed selfish
agents in symmetric congestion games. If agents use our IMITATION PROTOCOL, the resulting dynamics
converge rapidly to approximate equilibria, in which only a small fraction of players have latency significantly
above or below the average. In addition, in finite time the dynamics converge to an imitation-stable state,
in which no player can improve its latency by more than $\nu$ by imitating a different player. The IMITATION
PROTOCOL and the concept of an imitation-stable state have the drawback that dynamics can stabilize in a
quite disadvantageous situation, e.g. when all players play the same expensive strategy. This is due to the fact
that the strategy space is essentially restricted to the current strategy choices of the agents. Strategies that
might be attractive and offer large latency gain are "lost" once no player uses them anymore. For singleton
congestion games we showed that this event becomes unlikely to occur as the number of players increases.
Then, by removing parameter $\nu$ from the protocol, the dynamics become likely to converge to Nash equilibria.
Another approach to avoid losing strategies is to include exploration of the strategy space. Towards this end,
we can use an EXPLORATION PROTOCOL, in which players sample from the strategy space directly and then
migrate with a certain probability. If every player uses a suitably designed EXPLORATION PROTOCOL (or
any random combination of EXPLORATION PROTOCOL and IMITATION PROTOCOL), then the dynamics are
guaranteed to converge to a Nash equilibrium. However, acquiring information about possible strategies
and their benefits might be a complex and costly process in practice, and hence such an action should
be invoked only rarely. In addition, exploration requires small migration probabilities, because the danger of
overshooting is more severe. Thus, on the downside, if the EXPLORATION PROTOCOL is used exclusively,
this results in significantly larger convergence times.

A Appendix

A.1 Useful Facts

Throughout the technical part of this paper, we will apply the following two Chernoff bounds. 

Fact 16 (Chernoff, see [19]). Let $X$ be a sum of Bernoulli variables. Then, $P[X \ge k \cdot E[X]] \le e^{-E[X]\,k\,(\ln k - 1)}$,
and, for $k \ge 4 > e^{4/3}$, $P[X \ge k \cdot E[X]] \le e^{-\frac{1}{4} E[X]\,k \ln k}$. Equivalently, for $k \ge 4\,E[X]$,
$P[X \ge k] \le e^{-\frac{1}{4} k \ln(k/E[X])}$.

The following fact yields a linear approximation of the exponential function. 
Fact 17. For any $r > 0$ and $x \in [0, r]$, it holds that $(e^x - 1) \le x \cdot \frac{e^r - 1}{r}$.

Proof. The function $\exp(x) - 1$ is convex and it goes through the points $(0, 0)$ and $(r, e^r - 1)$, as does the
function $x \cdot \frac{e^r - 1}{r}$. □

Fact 18. For every $c \in\, ]0, 1[$ it holds

$$\sum_{k=1}^{\infty} c^k = \frac{c}{1 - c} .$$



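Fact 19 (Jensen's Inequality). Let $f : \mathbb{R} \to \mathbb{R}$ be a convex function, and let $a_1, \dots, a_k, x_1, \dots, x_k \in \mathbb{R}$. Then

$$f\!\left(\frac{\sum_{i=1}^{k} a_i x_i}{\sum_{i=1}^{k} a_i}\right) \le \frac{\sum_{i=1}^{k} a_i f(x_i)}{\sum_{i=1}^{k} a_i} .$$

If $f(x) = x^2$, then

$$\frac{1}{\sum_{i=1}^{k} a_i} \left(\sum_{i=1}^{k} a_i x_i\right)^2 \le \sum_{i=1}^{k} a_i f(x_i) .$$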



Lemma 20 ([14]). Let $X_0, X_1, \dots$ denote a sequence of non-negative random variables and assume that for
all $i > 0$

$$E[X_i \mid X_{i-1} = x_{i-1}] \le x_{i-1} - 1 ,$$

and let $\tau$ denote the first time $t$ such that $X_t = 0$. Then,

$$E[\tau \mid X_0 = x_0] \le x_0 .$$
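
A toy process makes the additive drift bound concrete. The sketch below is an illustration only, and the process and the function name `expected_hitting_time` are assumptions made for this example. It considers a walk that drops by `step` with probability `p` and otherwise stays put, so its expected decrease per round is `p * step`, and it computes the expected time to reach 0 exactly with rational arithmetic.

```python
from fractions import Fraction


def expected_hitting_time(x0, step=2, p=Fraction(1, 2)):
    """Exact expected time to reach 0 for a walk that drops by `step` w.p. p, else stays.

    x0 must be a multiple of step so that the walk hits 0 exactly.
    The drift condition of the lemma corresponds to p * step >= 1.
    """
    if x0 % step != 0:
        raise ValueError("x0 must be a multiple of step")
    expected = Fraction(0)
    for _ in range(x0 // step):
        # From any positive state: E = 1 + (1 - p) * E + p * E_next, so E = E_next + 1 / p.
        expected += 1 / p
    return expected
```

With the default parameters the drift is exactly 1 and the expected hitting time equals the starting value, so the bound is tight for this process. A larger drift gives a smaller expected time.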



Lemma 21 ([14]). Let $X_0, X_1, \dots$ denote a sequence of non-negative random variables and assume that for
all $i > 0$, $E[X_i \mid X_{i-1} = x_{i-1}] \le x_{i-1} \cdot \alpha$ for some constant $\alpha \in (0, 1)$. Furthermore, fix some constant
$x^* \in (0, x_0]$ and let $\tau$ be the random variable that describes the smallest $t$ such that $X_t \le x^*$. Then,

E[T\X =X ]<—^—-\og(^) . 

log(l/a) \x* J 

Again, as a consequence of Lemma 20 the expected time until the potential decreases from at most $\Phi$ to $\Phi^*$
can be found in the appendix, and which is proved, e.g., in [14].